Selecting an RDF collection/list and iterating over the results with Jena

Time: 2013-07-17 11:18:48

Tags: list rdf resultset sparql jena

Given some RDF like this:

<?xml version="1.0"?>
<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:blah="http://www.something.org/stuff#">
<rdf:Description rdf:about="http://www.something.org/stuff/some_entity1">
<blah:stringid>string1</blah:stringid>
<blah:uid>1</blah:uid>
<blah:myitems rdf:parseType="Collection">
  <blah:myitem>
        <blah:myitemvalue1>7</blah:myitemvalue1>
        <blah:myitemvalue2>8</blah:myitemvalue2>
     </blah:myitem>
...
    <blah:myitem>
     <blah:myitemvalue1>7</blah:myitemvalue1>
        <blah:myitemvalue2>8</blah:myitemvalue2>
    </blah:myitem>
</blah:myitems>
</rdf:Description>

<rdf:Description rdf:about="http://www.something.org/stuff/some__other_entity2">
<blah:stringid>string2</blah:stringid>
<blah:uid>2</blah:uid>
<blah:myitems rdf:parseType="Collection">
    <blah:myitem>
        <blah:myitemvalue1>7</blah:myitemvalue1>
        <blah:myitemvalue2>8</blah:myitemvalue2>
     </blah:myitem>
....
    <blah:myitem>
        <blah:myitemvalue1>7</blah:myitemvalue1>
        <blah:myitemvalue2>8</blah:myitemvalue2>
    </blah:myitem>
</blah:myitems>
</rdf:Description>
</rdf:RDF>

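I am using Jena and SPARQL. I would like to use a SELECT query to retrieve the myitems node for an entity with a particular stringid, then extract it from the result set, iterate over it, and get the values of each myitem node. Order isn't important.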

So I have two questions:

  1. Do I need to specify in my query that blah:myitems is a list?
  2. How can I parse a list in a ResultSet?

1 answer:

Answer 0 (score: 9)

Selecting lists (and elements) in SPARQL

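Let's address the SPARQL question first. I've modified your data slightly so that the elements have different values, which makes them easier to tell apart in the output. Here is the data in N3 format, which is more concise, especially for representing lists: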

@prefix rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix blah:    <http://www.something.org/stuff#> .

<http://www.something.org/stuff/some_entity1>
      blah:myitems ([ a       blah:myitem ;
                  blah:myitemvalue1 "1" ;
                  blah:myitemvalue2 "2"
                ] [ a       blah:myitem ;
                  blah:myitemvalue1 "3" ;
                  blah:myitemvalue2 "4"
                ]) ;
      blah:stringid "string1" ;
      blah:uid "1" .

<http://www.something.org/stuff/some__other_entity2>
      blah:myitems ([ a       blah:myitem ;
                  blah:myitemvalue1 "5" ;
                  blah:myitemvalue2 "6"
                ] [ a       blah:myitem ;
                  blah:myitemvalue1 "7" ;
                  blah:myitemvalue2 "8"
                ]) ;
      blah:stringid "string2" ;
      blah:uid "2" .

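In the question you mentioned selecting the myitems node, but myitems is actually the property that relates an entity to its list. You can select properties in SPARQL, but I'm guessing that you actually want to select the head of the list, i.e., the value of the myitems property. That is straightforward. You don't need to specify that the value is an rdf:List, but if the value of myitems could also be something other than a list, then you should restrict the query to rdf:Lists. (To develop the SPARQL queries, I'll run them with Jena's ARQ command-line tools, since it is easy to move them into Java code afterward.)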

prefix blah: <http://www.something.org/stuff#> 

select ?list where { 
  [] blah:myitems ?list .
}

$ arq --data data.n3 --query items.sparql
--------
| list |
========
| _:b0 |
| _:b1 |
--------

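The heads of the lists are blank nodes, so this is the kind of result we expect. From these results you could take a resource out of the result set and start walking down the list. However, since you don't care about the order of the nodes in the list, you might as well select the elements in the SPARQL query itself and then iterate through the result set, getting each item. You are probably also interested in the entity whose items you are retrieving, so that is in this query too.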

prefix blah:    <http://www.something.org/stuff#> 
prefix rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

select ?entity ?list ?item ?value1 ?value2 where { 
  ?entity blah:myitems ?list .
  ?list rdf:rest* [ rdf:first ?item ] .
  ?item a blah:myitem ;
        blah:myitemvalue1 ?value1 ;
        blah:myitemvalue2 ?value2 .
}
order by ?entity ?list

$ arq --data data.n3 --query items.sparql
----------------------------------------------------------------------------------------
| entity                                               | list | item | value1 | value2 |
========================================================================================
| <http://www.something.org/stuff/some__other_entity2> | _:b0 | _:b1 | "7"    | "8"    |
| <http://www.something.org/stuff/some__other_entity2> | _:b0 | _:b2 | "5"    | "6"    |
| <http://www.something.org/stuff/some_entity1>        | _:b3 | _:b4 | "3"    | "4"    |
| <http://www.something.org/stuff/some_entity1>        | _:b3 | _:b5 | "1"    | "2"    |
----------------------------------------------------------------------------------------

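Because the results are ordered by entity and by list (in case some entity has more than one value for the myitems property), you can iterate through the result set and be sure that all the elements of a given entity's list arrive together, though not necessarily in list order. Since your question was about lists in result sets, and not about how to work with result sets, I'll assume that iterating through the results isn't a problem.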
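If it helps to see that grouping step spelled out, here is a minimal sketch in plain Java. It assumes the rows have already been copied out of the result set into a hypothetical Row class (a stand-in for this illustration, not part of Jena) and that they arrive sorted by entity and list, as the query above requests. The values mirror the sample output.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class GroupedResults {
    // Hypothetical stand-in for one row of the result set; not a Jena class.
    static final class Row {
        final String entity, list, value1, value2;

        Row(String entity, String list, String value1, String value2) {
            this.entity = entity;
            this.list = list;
            this.value1 = value1;
            this.value2 = value2;
        }
    }

    // Splits rows that are already sorted by entity and list into one group
    // per (entity, list) pair, starting a new group whenever the pair changes.
    static List<List<Row>> splitIntoLists(List<Row> sortedRows) {
        List<List<Row>> groups = new ArrayList<>();
        String previousKey = null;
        for (Row row : sortedRows) {
            String key = row.entity + " " + row.list;
            if (!key.equals(previousKey)) {
                groups.add(new ArrayList<>());
                previousKey = key;
            }
            groups.get(groups.size() - 1).add(row);
        }
        return groups;
    }

    public static void main(String[] args) {
        String e1 = "http://www.something.org/stuff/some_entity1";
        String e2 = "http://www.something.org/stuff/some__other_entity2";
        // Same rows as the sample output, with blank node labels kept as plain strings.
        List<Row> rows = Arrays.asList(
                new Row(e2, "_:b0", "7", "8"),
                new Row(e2, "_:b0", "5", "6"),
                new Row(e1, "_:b3", "3", "4"),
                new Row(e1, "_:b3", "1", "2"));
        for (List<Row> group : splitIntoLists(rows)) {
            System.out.println(group.get(0).entity + " has " + group.size() + " items");
            for (Row row : group) {
                System.out.println("\tvalue1: " + row.value1 + "\tvalue2: " + row.value2);
            }
        }
    }
}
```

Because the key combines the entity with the list head, an entity that had two values for myitems would produce two separate groups.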

Working with lists in Jena

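The following example shows how to work with the lists in Java. The first part of the code is just boilerplate that loads the model and runs the SPARQL query. Once you have the results of the query, you can either treat the resource as the head of a linked list and iterate manually using the rdf:first and rdf:rest properties, or you can convert the resource to a Jena RDFList and get an iterator from it.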

import java.io.IOException;
import java.io.InputStream;

// These are the Jena 2.x package names; Jena 3 and later use org.apache.jena.* instead.
import com.hp.hpl.jena.query.QueryExecutionFactory;
import com.hp.hpl.jena.query.QuerySolution;
import com.hp.hpl.jena.query.ResultSet;
import com.hp.hpl.jena.rdf.model.Model;
import com.hp.hpl.jena.rdf.model.ModelFactory;
import com.hp.hpl.jena.rdf.model.Property;
import com.hp.hpl.jena.rdf.model.RDFList;
import com.hp.hpl.jena.rdf.model.RDFNode;
import com.hp.hpl.jena.rdf.model.Resource;
import com.hp.hpl.jena.util.iterator.ExtendedIterator;
import com.hp.hpl.jena.vocabulary.RDF;

public class SPARQLListExample {
    public static void main(String[] args) throws IOException {
        // Create a model and load the data
        Model model = ModelFactory.createDefaultModel();
        try ( InputStream in = SPARQLListExample.class.getClassLoader().getResourceAsStream( "SPARQLListExampleData.rdf" ) ) {
            model.read( in, null );
        }
        String blah = "http://www.something.org/stuff#";
        Property myitemvalue1 = model.createProperty( blah + "myitemvalue1" );
        Property myitemvalue2 = model.createProperty( blah + "myitemvalue2" );

        // Run the SPARQL query and get some results
        String getItemsLists = "" +
                "prefix blah: <http://www.something.org/stuff#>\n" +
                "\n" +
                "select ?list where {\n" +
                "  [] blah:myitems ?list .\n" +
                "}";
        ResultSet results = QueryExecutionFactory.create( getItemsLists, model ).execSelect();

        // For each solution in the result set
        while ( results.hasNext() ) {
            QuerySolution qs = results.next();
            Resource list = qs.getResource( "list" );
            // Once you've got the head of the list, you can either process it manually 
            // as a linked list, using RDF.first to get elements and RDF.rest to get 
            // the rest of the list...
            for ( Resource curr = list;
                  !RDF.nil.equals( curr );
                  curr = curr.getRequiredProperty( RDF.rest ).getObject().asResource() ) {
                Resource item = curr.getRequiredProperty( RDF.first ).getObject().asResource();
                RDFNode value1 = item.getRequiredProperty( myitemvalue1 ).getObject();
                RDFNode value2 = item.getRequiredProperty( myitemvalue2 ).getObject();
                System.out.println( item+" has:\n\tvalue1: "+value1+"\n\tvalue2: "+value2 );
            }
            // ...or you can make it into a Jena RDFList that can give you an iterator
            RDFList rdfList = list.as( RDFList.class );
            ExtendedIterator<RDFNode> items = rdfList.iterator();
            while ( items.hasNext() ) {
                Resource item = items.next().asResource();
                RDFNode value1 = item.getRequiredProperty( myitemvalue1 ).getObject();
                RDFNode value2 = item.getRequiredProperty( myitemvalue2 ).getObject();
                System.out.println( item+" has:\n\tvalue1: "+value1+"\n\tvalue2: "+value2 );
            }
        }
    }
}