Is there a way to load an alignment file into Python? Suppose I have a file like this:
<?xml version='1.0' encoding='utf-8' standalone='no'?>
<rdf:RDF xmlns='http://knowledgeweb.semanticweb.org/heterogeneity/alignment#'
xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'
xmlns:xsd='http://www.w3.org/2001/XMLSchema#'
xmlns:align='http://knowledgeweb.semanticweb.org/heterogeneity/alignment#'>
<Alignment>
<map>
<Cell>
<entity1 rdf:resource="http://linkeddata.uriburner.com/about/id/entity//www.last.fm/music/Catie+Curtis"></entity1>
<entity2 rdf:resource="http://discogs.dataincubator.org/artist/catie-curtis"></entity2>
<relation>=</relation>
<measure rdf:datatype="http://www.w3.org/2001/XMLSchema#float">1.0</measure>
</Cell>
</map>
<map>
<Cell>
<entity1 rdf:resource="http://linkeddata.uriburner.com/about/id/entity//www.last.fm/music/Bigelf"></entity1>
<entity2 rdf:resource="http://discogs.dataincubator.org/artist/bigelf"></entity2>
<relation>=</relation>
<measure rdf:datatype="http://www.w3.org/2001/XMLSchema#float">0.8</measure>
</Cell>
</map>
<map>
<Cell>
<entity1 rdf:resource="http://linkeddata.uriburner.com/about/id/entity//www.last.fm/music/%C3%81kos"></entity1>
<entity2 rdf:resource="http://discogs.dataincubator.org/artist/%C3%81kos"></entity2>
<relation>=</relation>
<measure rdf:datatype="http://www.w3.org/2001/XMLSchema#float">0.9</measure>
</Cell>
</map>
</Alignment>
</rdf:RDF>
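I want to keep the confidence together with the triple:
Subject: http://linkeddata.uriburner.com/about/id/entity//www.last.fm/music/Catie+Curtis
Predicate: owl:sameAs
Object: http://discogs.dataincubator.org/artist/catie-curtis
Confidence: 1.0
I tried to do this with RDFlib, but without success. Any suggestions would help, thanks!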
Answer 0: (score: 3)
Try the Redland library: http://librdf.org/docs/python.html
import RDF
parser = RDF.Parser(name="rdfxml")
model = RDF.Model()
parser.parse_into_model(model, "file:./align.rdf", None)
Then query the model variable. For example, to retrieve every alignment cell together with its measure, the query looks like this:
query = "SELECT ?a ?m WHERE {?a a <http://knowledgeweb.semanticweb.org/heterogeneity/alignment#Cell> ; <http://knowledgeweb.semanticweb.org/heterogeneity/alignment#measure> ?m. }"
# Each result maps a query variable name to the RDF node bound to it.
for statement in RDF.Query(query, query_language="sparql").execute(model):
    print("cell: %s measure:%s" % (statement['a'], statement['m']))
The result is an iterator of dictionary-like objects (variable name, value), and the loop prints them as follows:
cell: (r1301329275r1126r2) measure:1.0^^<http://www.w3.org/2001/XMLSchema#float>
cell: (r1301329275r1126r3) measure:0.8^^<http://www.w3.org/2001/XMLSchema#float>
cell: (r1301329275r1126r4) measure:0.9^^<http://www.w3.org/2001/XMLSchema#float>
The Python API for retrieving the contents of a node is documented here: http://librdf.org/docs/python.html
For an overview of the SPARQL query language, see: http://www.w3.org/TR/rdf-sparql-query/
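If installing an RDF library is not an option, the alignment format in the question is regular enough to read with the standard library alone. The following is only a sketch of that alternative, not the Redland approach above: it uses `xml.etree.ElementTree`, and the names `load_alignment`, `SAMPLE`, and `OWL_SAME_AS` are made up for illustration. It assumes that a `=` relation should be reported as owl:sameAs (as the question asks) and it skips cells with any other relation.

```python
import io
import xml.etree.ElementTree as ET

ALIGN_NS = "http://knowledgeweb.semanticweb.org/heterogeneity/alignment#"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# Assumption: the "=" relation is reported as owl:sameAs, as the question asks.
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

# Shortened copy of the file from the question (hypothetical test data).
SAMPLE = b"""<?xml version='1.0' encoding='utf-8' standalone='no'?>
<rdf:RDF xmlns='http://knowledgeweb.semanticweb.org/heterogeneity/alignment#'
 xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'>
<Alignment>
<map><Cell>
<entity1 rdf:resource="http://linkeddata.uriburner.com/about/id/entity//www.last.fm/music/Catie+Curtis"></entity1>
<entity2 rdf:resource="http://discogs.dataincubator.org/artist/catie-curtis"></entity2>
<relation>=</relation>
<measure rdf:datatype="http://www.w3.org/2001/XMLSchema#float">1.0</measure>
</Cell></map>
<map><Cell>
<entity1 rdf:resource="http://linkeddata.uriburner.com/about/id/entity//www.last.fm/music/Bigelf"></entity1>
<entity2 rdf:resource="http://discogs.dataincubator.org/artist/bigelf"></entity2>
<relation>=</relation>
<measure rdf:datatype="http://www.w3.org/2001/XMLSchema#float">0.8</measure>
</Cell></map>
</Alignment>
</rdf:RDF>
"""


def load_alignment(source):
    """Return (subject, predicate, object, confidence) tuples.

    `source` is a file name or a binary file object.
    """
    root = ET.parse(source).getroot()
    resource_attr = "{%s}resource" % RDF_NS
    rows = []
    # The file's default namespace is the alignment namespace, so every
    # unprefixed element (Cell, entity1, ...) lives in ALIGN_NS.
    for cell in root.iter("{%s}Cell" % ALIGN_NS):
        entity1 = cell.find("{%s}entity1" % ALIGN_NS)
        entity2 = cell.find("{%s}entity2" % ALIGN_NS)
        relation = cell.findtext("{%s}relation" % ALIGN_NS, default="")
        measure = cell.findtext("{%s}measure" % ALIGN_NS)
        if entity1 is None or entity2 is None or measure is None:
            continue  # incomplete cell, nothing to report
        if relation.strip() != "=":
            continue  # only equivalence is mapped in this sketch
        rows.append((entity1.get(resource_attr), OWL_SAME_AS,
                     entity2.get(resource_attr), float(measure)))
    return rows


if __name__ == "__main__":
    for subject, predicate, obj, confidence in load_alignment(io.BytesIO(SAMPLE)):
        print(subject, predicate, obj, confidence)
```

To read a real file, pass its path instead, for example `load_alignment("align.rdf")`. This treats the file as plain XML rather than RDF, so it depends on the file keeping exactly the element layout shown in the question; for anything less regular, the SPARQL approach is the safer route.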