Getting the lang value of a Literal object with RDFlib

Date: 2014-04-12 00:00:58

Tags: python rdflib

I have this RDF file:

<!DOCTYPE rdf:RDF [
    <!ENTITY db "http://dbpedia.org/ontology/" >
    <!ENTITY owl "http://www.w3.org/2002/07/owl#" >
    <!ENTITY xsd "http://www.w3.org/2001/XMLSchema#" >
    <!ENTITY rdfs "http://www.w3.org/2000/01/rdf-schema#" >
    <!ENTITY rdf "http://www.w3.org/1999/02/22-rdf-syntax-ns#" >]>

<rdf:RDF xmlns="http://dbpedia.org/ontology/"
     xml:base="http://dbpedia.org/ontology/"
     xmlns:db="http://dbpedia.org/ontology/"
     xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
     xmlns:owl="http://www.w3.org/2002/07/owl#"
     xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
     xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">


    <owl:ObjectProperty rdf:about="&db;actingHeadteacher">
        <rdfs:label xml:lang="el">διευθυντής σχολείου</rdfs:label>
        <rdfs:label xml:lang="en">acting headteacher</rdfs:label>
    </owl:ObjectProperty>
</rdf:RDF>

and I want to filter the Literal objects by their lang value. For example:

from rdflib import Graph, util  # util provides guess_format()
from rdflib.namespace import RDFS
filetype = util.guess_format(rdf_file)
g = Graph()
g.parse(rdf_file, format = filetype)
for s,p,o in g.triples((None, RDFS.label, None)):
    print(repr(o))  # rdflib.term.Literal('acting headteacher', lang='en')
                    # rdflib.term.Literal('διευθυντής σχολείου', lang='el')

I want to get o only when lang='en'.

2 answers:

Answer 0 (score: 4)

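If you check the manual for rdflib, you will find that rdflib.term.Literal has an attribute called language as well as a method. However, calling the method did not seem to work for me.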

Something like this should do it:

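from rdflib import URIRef

# The XML entity &db; is not expanded inside a Python string,
# so the full URI is spelled out here.
subject = URIRef('http://dbpedia.org/ontology/actingHeadteacher')

# just getting your literals directly here
# (graph is the parsed Graph, called g in the question):
generator = graph.objects(subject, RDFS.label)

for lit in generator:
    print(lit.language)  # the language tag of each label, or None if untagged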

label / preferredLabel

If you are only interested in labels / preferred labels (SKOS or RDFS), check page 47 from the manual.

subject = URIRef('http://dbpedia.org/ontology/actingHeadteacher')
graph.preferredLabel(subject=subject, lang='en')  # or lang='el'

This returns a list of (labelProp, label) pairs, where labelProp is either skos:prefLabel or rdfs:label.

Answer 1 (score: 0)

There may be a more elegant or better-performing solution in rdflib, but you can use a SPARQL query:

from rdflib import Graph

g = Graph()
g.parse("../stw.nt", format="nt")

qres = g.query(
    """SELECT ?label
        WHERE {
            ?s ?p ?label
            FILTER langMatches( lang(?label), "en" )
        }"""
)

for row in qres:
    print(row.label)