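I am trying to extract labels for certain people from DBpedia. I am partially successful so far, but I have run into the following problem. The following code works.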
import com.hp.hpl.jena.query.Query;
import com.hp.hpl.jena.query.QueryExecution;
import com.hp.hpl.jena.query.QueryExecutionFactory;
import com.hp.hpl.jena.query.QueryFactory;
import com.hp.hpl.jena.query.QuerySolution;
import com.hp.hpl.jena.query.ResultSet;

public class DbPediaQueryExtractor {
    public static void main(String[] args) {
        String entity = "Aharon_Barak";
        // The entity is pasted directly into the query text as a prefixed name.
        String queryString = "PREFIX dbres: <http://dbpedia.org/resource/> SELECT * WHERE {dbres:"+ entity+ "<http://www.w3.org/2000/01/rdf-schema#label> ?o FILTER (langMatches(lang(?o),\"en\"))}";
        //String queryString="select * where { ?instance <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://xmlns.com/foaf/0.1/Person>; <http://www.w3.org/2000/01/rdf-schema#label> ?o FILTER (langMatches(lang(?o),\"en\")) } LIMIT 5000000";
        QueryExecution qexec = getResult(queryString);
        try {
            ResultSet results = qexec.execSelect();
            // Print every English label bound to ?o.
            while (results.hasNext()) {
                QuerySolution soln = results.nextSolution();
                System.out.println(soln.get("?o"));
            }
        } finally {
            qexec.close();
        }
    }

    // Parses the query locally, then prepares it for the public DBpedia endpoint.
    public static QueryExecution getResult(String queryString) {
        Query query = QueryFactory.create(queryString);
        //VirtuosoQueryExecution vqe = VirtuosoQueryExecutionFactory.create (sparql, graph);
        return QueryExecutionFactory.sparqlService("http://dbpedia.org/sparql", query);
    }
}
However, it does not work when the entity contains parentheses. For example,
String entity = "William_H._Miller_(writer)";
results in this exception:
Exception in thread "main" com.hp.hpl.jena.query.QueryParseException: Encountered " "(" "( "" at line 1, column 86.
What is the problem?
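Answer 0 (score: 6)
It took some copying and pasting to see exactly what is going on. I suggest adding newlines to your query so that it is easier to read. The query you are using is: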
PREFIX dbres: <http://dbpedia.org/resource/>
SELECT * WHERE
{
dbres:??? <http://www.w3.org/2000/01/rdf-schema#label> ?o
FILTER (langMatches(lang(?o),"en"))
}
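where ??? is replaced by the contents of the string entity. There is no input validation here at all to ensure that the value of entity is legal to paste into that position. Based on your question, it sounds like entity contains William_H._Miller_(writer), so you are getting the query: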
PREFIX dbres: <http://dbpedia.org/resource/>
SELECT * WHERE
{
dbres:William_H._Miller_(writer) <http://www.w3.org/2000/01/rdf-schema#label> ?o
FILTER (langMatches(lang(?o),"en"))
}
You can paste this into the public DBpedia endpoint, and you will get a similar parse error message:
Virtuoso 37000 Error SP030: SPARQL compiler, line 6: syntax error at 'writer' before ')'
SPARQL query:
define sql:big-data-const 0
#output-format:text/html
define sql:signal-void-variables 1 define input:default-graph-uri <http://dbpedia.org> PREFIX dbres: <http://dbpedia.org/resource/>
SELECT * WHERE
{
dbres:William_H._Miller_(writer) <http://www.w3.org/2000/01/rdf-schema#label> ?o
FILTER (langMatches(lang(?o),"en"))
}
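Better than hitting DBpedia's endpoint with bad queries, you can also use the SPARQL query validator, which reports for that query:
Syntax error: Lexical error at line 4, column 34. Encountered: ")" (41), after : "writer"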
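If you do build the query text by hand, one way around the problem is to skip the prefixed name and write the full IRI in angle brackets, since parentheses are legal there. The following is only a minimal sketch using plain Java: the class name IriRefSketch and the method toIriRef are hypothetical, and the check is based on the characters the SPARQL grammar excludes from an IRI reference (angle brackets, double quote, braces, pipe, caret, backtick, backslash, and anything up to and including the space character).

```java
public class IriRefSketch {
    // Namespace taken from the question; everything else here is illustrative.
    static final String DBPEDIA_RESOURCE = "http://dbpedia.org/resource/";

    // Characters the SPARQL grammar does not allow inside <...>.
    // Space and control characters are rejected separately below.
    static final String FORBIDDEN = "<>\"{}|^`\\";

    /** Returns the entity as a full IRI reference, or throws if it cannot be one. */
    static String toIriRef(String entity) {
        for (int i = 0; i < entity.length(); i++) {
            char c = entity.charAt(i);
            if (c <= 0x20 || FORBIDDEN.indexOf(c) >= 0) {
                throw new IllegalArgumentException(
                    "Character not allowed in an IRI reference: '" + c + "' at index " + i);
            }
        }
        return "<" + DBPEDIA_RESOURCE + entity + ">";
    }

    public static void main(String[] args) {
        String entity = "William_H._Miller_(writer)";
        // Build the same label query, but with a full IRI instead of dbres:...
        String queryString =
            "SELECT * WHERE\n" +
            "{\n" +
            "  " + toIriRef(entity) + " <http://www.w3.org/2000/01/rdf-schema#label> ?o\n" +
            "  FILTER (langMatches(lang(?o),\"en\"))\n" +
            "}\n";
        System.out.println(queryString);
    }
}
```

Rejecting bad input instead of trying to repair it keeps the sketch small, and it also stops a value such as "x> ?p ?q" from changing the shape of the query.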
In Jena, you can use ParameterizedSparqlString to avoid these problems. Here is your example, reworked to use a parameterized string:
import com.hp.hpl.jena.query.ParameterizedSparqlString;

public class PSSExample {
    public static void main( String[] args ) {
        // Create a parameterized SPARQL string for the particular query, and add the
        // dbres prefix to it, for later use.
        final ParameterizedSparqlString queryString = new ParameterizedSparqlString(
                "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n" +
                "SELECT * WHERE\n" +
                "{\n" +
                "  ?entity rdfs:label ?o\n" +
                "  FILTER (langMatches(lang(?o),\"en\"))\n" +
                "}\n"
                ) {{
            setNsPrefix( "dbres", "http://dbpedia.org/resource/" );
        }};
        // Entity is the same.
        final String entity = "William_H._Miller_(writer)";
        // Now retrieve the URI for dbres, concatenate it with entity, and use
        // it as the value of ?entity in the query.
        queryString.setIri( "?entity", queryString.getNsPrefixURI( "dbres" )+entity );
        // Show the query.
        System.out.println( queryString.toString() );
    }
}
The output is:
PREFIX dbres: <http://dbpedia.org/resource/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT * WHERE
{
<http://dbpedia.org/resource/William_H._Miller_(writer)> rdfs:label ?o
FILTER (langMatches(lang(?o),"en"))
}
You can run this query at the public endpoint and get the expected results. Note that if you use an entity that does not need special escaping, e.g.,
final String entity = "George_Washington";
then the query output will use the prefixed form:
PREFIX dbres: <http://dbpedia.org/resource/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT * WHERE
{
dbres:George_Washington rdfs:label ?o
FILTER (langMatches(lang(?o),"en"))
}
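This is very convenient, because you do not have to check whether your suffix (i.e., entity) contains any characters that would need to be escaped; Jena takes care of that for you.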
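For comparison, here is roughly what escaping a suffix by hand could look like. SPARQL 1.1 lets certain punctuation appear in the local part of a prefixed name when it is preceded by a backslash, so the parentheses can in principle stay in prefixed form. This is only a sketch under stated assumptions: the class name LocalNameEscapeSketch and the method escapeLocalName are hypothetical, the method is deliberately conservative (it accepts only ASCII letters, digits, underscore, and the escapable punctuation), and it is not a description of what Jena does internally. SPARQL 1.0 parsers do not accept these escapes, so check that your endpoint does before relying on them.

```java
public class LocalNameEscapeSketch {
    // Punctuation that SPARQL 1.1 allows in a local name when backslash-escaped.
    // Underscore is legal unescaped, so it is passed through as-is below.
    static final String ESCAPABLE = "~.-!$&'()*+,;=/?#@%";

    /** Backslash-escapes a suffix for use after a prefix such as dbres: */
    static String escapeLocalName(String suffix) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < suffix.length(); i++) {
            char c = suffix.charAt(i);
            boolean asciiAlnum = (c >= 'A' && c <= 'Z')
                    || (c >= 'a' && c <= 'z')
                    || (c >= '0' && c <= '9');
            if (asciiAlnum || c == '_') {
                out.append(c);                    // safe as-is
            } else if (ESCAPABLE.indexOf(c) >= 0) {
                out.append('\\').append(c);       // needs a backslash
            } else {
                // Conservative: anything else is refused rather than guessed at.
                throw new IllegalArgumentException(
                    "Cannot use '" + c + "' in a prefixed name (index " + i + ")");
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println("dbres:" + escapeLocalName("William_H._Miller_(writer)"));
        System.out.println("dbres:" + escapeLocalName("George_Washington"));
    }
}
```

Even this small helper has to decide which characters to escape and which to refuse, which is the work that ParameterizedSparqlString saves you.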