In my database I have triples like the following:
DocumentUri -> dc.title -> title
DocumentUri -> dc.language -> language
DocumentUri -> dc.description -> description
DocumentUri -> dc.creator -> AuthorUri
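I want to be able to search documents by title, and then get all the properties of every document that matches the title search. I am trying to do this with Jena and SPARQL. I wrote a query that takes a title and gets the URIs of the documents with that title. This is the method; it retrieves the URIs and stores them in a list named webDocumentListInicial: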
public void searchUriByTitle() {
    RDFNode documentUriNode;
    String queryString = "PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> " +
            "PREFIX dc: <http://purl.org/dc/elements/1.1/> SELECT ?document WHERE { " +
            "?document dc:title ?title." +
            "FILTER (?title = \"" + this.getTitle() + "\" ). }";
    Query query = QueryFactory.create(queryString);
    QueryExecution qe = QueryExecutionFactory.create(query, databaseModel);
    ResultSet results = qe.execSelect();
    while( results.hasNext() ) {
        QuerySolution querySolution = results.next();
        documentUriNode = querySolution.get("document");
        WebDocument document = new WebDocument(documentUriNode.toString());
        this.webDocumentListInicial.add(document);
    }
    qe.close();
}
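To get the documents' creators I run another query, because in this case the value in the triple is another resource. Here I iterate over the list of document URIs populated by the method above.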
public void searchAuthorByTitle() {
    for( WebDocument doc : this.webDocumentListInicial ) {
        RDFNode authorUriNode;
        String queryString = "PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> " +
                "PREFIX dc: <http://purl.org/dc/elements/1.1/> SELECT ?author WHERE { " +
                "?document dc:creator ?author." +
                "FILTER (?document = <" + doc.getUri() + "> ). }";
        Query query = QueryFactory.create(queryString);
        QueryExecution qe = QueryExecutionFactory.create(query, databaseModel);
        ResultSet results = qe.execSelect();
        while( results.hasNext() ) {
            QuerySolution querySolution = results.next();
            authorUriNode = querySolution.get("author");
            WebAuthor author;
            author = this.searchAuthorProperties(authorUriNode.toString(), new WebAuthor(authorUriNode.toString()) );
            doc.addAuthor(author);
        }
        qe.close();
    }
}
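To get the other document properties I do something like the example below, again iterating over the list populated by the first method shown above.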
public void searchDescription() {
    for( WebDocument doc : this.webDocumentListInicial ) {
        String description = "";
        Resource resource = ResourceFactory.createResource(doc.getUri());
        StmtIterator descriptionStmtIt = databaseModel.listStatements(resource, DC.description,(RDFNode) null);
        while( descriptionStmtIt.hasNext() ) {
            description = descriptionStmtIt.next().getObject().toString();
        }
        doc.setDescription(description);
    }
}
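Handling the data this way is inefficient, because I need a separate query for each property.
Is it possible to run a single query that gets the document URIs and all the other document properties at the same time? I tried once, like this: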
String queryString = "PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> " +
        "PREFIX dc: <http://purl.org/dc/elements/1.1/> SELECT ?document ?description " +
        "?language ?author WHERE { " +
        "?document dc:title ?title . " +
        "?document dc:language ?language . " +
        "?document dc:description ?description . " +
        "?document dc:creator ?author . " +
        "FILTER (?title = \"" + this.getTitle() + "\" ). }";
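But when more than one document matches the given title, it is hard to tell which of the returned properties belong to which document.
Thanks!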
Answer 0 (score: 4)
It sounds like you're doing a lot more work than you need to. If you have data like this:
@prefix : <http://stackoverflow.com/q/20436820/1281433/> .
:doc1 :title "Title1" ; :author :author1 ; :date "date-1" .
:doc2 :title "Title2" ; :author :author2 ; :date "date-2" .
:doc3 :title "Title3" ; :author :author3 ; :date "date-3" .
:doc4 :title "Title4" ; :author :author4 ; :date "date-4" .
:doc5 :title "Title5" ; :author :author5 ; :date "date-5" .
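and a list of titles, say "Title1" "Title4" "Title5", and you want to retrieve the document resource for each title along with its associated author and date, you can use a query like this: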
prefix : <http://stackoverflow.com/q/20436820/1281433/>
select ?document ?author ?date where {
  values ?title { "Title1" "Title4" "Title5" }
  ?document :title ?title ;
            :author ?author ;
            :date ?date .
}
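You'll get results like this in a single ResultSet. Each row carries its own ?document binding, so every value is already tied to the document it belongs to, and there is no need to run multiple queries.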
----------------------------------
| document | author | date |
==================================
| :doc1 | :author1 | "date-1" |
| :doc4 | :author4 | "date-4" |
| :doc5 | :author5 | "date-5" |
----------------------------------
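If the titles come from user input rather than being hard-coded, the values clause has to be assembled at run time. The following is only a minimal sketch of that step, using plain Java with no Jena calls. The class name TitleQueryBuilder and its two methods are hypothetical names introduced here for illustration; the prefix and property names are the ones from the example data above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper (not part of Jena): builds the VALUES-based query for a list of titles.
final class TitleQueryBuilder {

    // Escape backslashes, double quotes and line breaks so a title
    // cannot terminate its own SPARQL string literal early.
    static String toLiteral(String title) {
        String escaped = title
                .replace("\\", "\\\\")
                .replace("\"", "\\\"")
                .replace("\n", "\\n")
                .replace("\r", "\\r");
        return "\"" + escaped + "\"";
    }

    // Join the escaped titles into a single values block and embed it in the query text.
    static String buildQuery(List<String> titles) {
        String values = titles.stream()
                .map(TitleQueryBuilder::toLiteral)
                .collect(Collectors.joining(" "));
        return "prefix : <http://stackoverflow.com/q/20436820/1281433/>\n"
                + "select ?document ?author ?date where {\n"
                + "  values ?title { " + values + " }\n"
                + "  ?document :title ?title ;\n"
                + "            :author ?author ;\n"
                + "            :date ?date .\n"
                + "}";
    }

    public static void main(String[] args) {
        System.out.println(buildQuery(Arrays.asList("Title1", "Title4", "Title5")));
    }
}
```

The returned string can be handed to QueryExecutionFactory.create in the same way as a constant query string. Escaping the titles matters because concatenating a raw title into the query, as the code in the question does, lets a title containing a double quote change the query itself.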
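Based on your comments, it sounds like you need to build some other kind of associative structure from the ResultSet. Here is one way to construct a Map<RDFNode,Map<String,RDFNode>> that maps each document IRI to another map, which in turn maps each variable name to the value bound to it.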
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import com.hp.hpl.jena.query.QueryExecutionFactory;
import com.hp.hpl.jena.query.QuerySolution;
import com.hp.hpl.jena.query.ResultSet;
import com.hp.hpl.jena.rdf.model.Model;
import com.hp.hpl.jena.rdf.model.ModelFactory;
import com.hp.hpl.jena.rdf.model.RDFNode;
public class HashedResultsExample {
    final static String DATA =
            "@prefix : <http://stackoverflow.com/q/20436820/1281433/> .\n" +
            "\n" +
            ":doc1 :title 'Title1' ; :author :author1 ; :date 'date-1' .\n" +
            ":doc2 :title 'Title2' ; :author :author2 ; :date 'date-2' .\n" +
            ":doc3 :title 'Title3' ; :author :author3 ; :date 'date-3' .\n" +
            ":doc4 :title 'Title4' ; :author :author4 ; :date 'date-4' .\n" +
            ":doc5 :title 'Title5' ; :author :author5 ; :date 'date-5' .\n" ;
    final static String QUERY =
            "prefix : <http://stackoverflow.com/q/20436820/1281433/>\n" +
            "select ?document ?author ?date where {\n" +
            " values ?title { \"Title1\" \"Title4\" \"Title5\" }\n" +
            " ?document :title ?title ; :author ?author ; :date ?date .\n" +
            "}" ;
    public static void main(String[] args) throws IOException {
        // Load the sample Turtle data into an in-memory model.
        final Model model = ModelFactory.createDefaultModel();
        try ( final InputStream in = new ByteArrayInputStream( DATA.getBytes() )) {
            model.read( in, null, "TTL" );
        }
        final ResultSet rs = QueryExecutionFactory.create( QUERY, model ).execSelect();
        // Outer map: document IRI -> (variable name -> bound value) for that row.
        final Map<RDFNode,Map<String,RDFNode>> map = new HashMap<>();
        while ( rs.hasNext() ) {
            final QuerySolution qs = rs.next();
            final Map<String,RDFNode> rowMap = new HashMap<>();
            for ( final Iterator<String> varNames = qs.varNames(); varNames.hasNext(); ) {
                final String varName = varNames.next();
                rowMap.put( varName, qs.get( varName ));
            }
            map.put( qs.get( "document" ), rowMap );
        }
        System.out.println( map );
    }
}
The output (from the map printed at the end), with some line breaks added for readability:
{http://stackoverflow.com/q/20436820/1281433/doc4=
{author=http://stackoverflow.com/q/20436820/1281433/author4,
document=http://stackoverflow.com/q/20436820/1281433/doc4,
date=date-4},
http://stackoverflow.com/q/20436820/1281433/doc1=
{author=http://stackoverflow.com/q/20436820/1281433/author1,
document=http://stackoverflow.com/q/20436820/1281433/doc1,
date=date-1},
http://stackoverflow.com/q/20436820/1281433/doc5=
{author=http://stackoverflow.com/q/20436820/1281433/author5,
document=http://stackoverflow.com/q/20436820/1281433/doc5,
date=date-5}}
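One caveat with the map built above: it keeps a single row per document, so if a document had two authors, the query would return two rows for it and the second put would replace the first. The following is a rough sketch of one way to keep every value, not something taken from the answer. It uses plain strings in place of RDFNode so that it runs without Jena; the class GroupedRowsSketch, its methods, and the second author on :doc1 are all hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: groups result rows by document and keeps every value per variable.
final class GroupedRowsSketch {

    // Groups rows by the value bound to keyVar; each other variable collects all of its values.
    static Map<String, Map<String, Set<String>>> groupBy(String keyVar, List<Map<String, String>> rows) {
        Map<String, Map<String, Set<String>>> grouped = new LinkedHashMap<>();
        for (Map<String, String> row : rows) {
            String key = row.get(keyVar);
            if (key == null) {
                continue; // no binding for the key variable in this row
            }
            Map<String, Set<String>> props = grouped.computeIfAbsent(key, k -> new LinkedHashMap<>());
            for (Map.Entry<String, String> e : row.entrySet()) {
                if (!e.getKey().equals(keyVar)) {
                    props.computeIfAbsent(e.getKey(), k -> new LinkedHashSet<>()).add(e.getValue());
                }
            }
        }
        return grouped;
    }

    // Builds one row from alternating variable names and values.
    static Map<String, String> row(String... namesAndValues) {
        Map<String, String> r = new LinkedHashMap<>();
        for (int i = 0; i + 1 < namesAndValues.length; i += 2) {
            r.put(namesAndValues[i], namesAndValues[i + 1]);
        }
        return r;
    }

    public static void main(String[] args) {
        List<Map<String, String>> rows = new ArrayList<>();
        // Hypothetical rows: :doc1 appears twice because it has two authors.
        rows.add(row("document", ":doc1", "author", ":author1", "date", "date-1"));
        rows.add(row("document", ":doc1", "author", ":author2", "date", "date-1"));
        rows.add(row("document", ":doc4", "author", ":author4", "date", "date-4"));
        System.out.println(groupBy("document", rows));
    }
}
```

With Jena, the rows would presumably be filled from each QuerySolution using the same varNames() loop as in the example above, and the value type could stay RDFNode instead of String.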