How do I access an element's child elements using minidom?

Asked: 2021-05-24 05:27:39

Tags: python xml minidom

I am reading an XML/OWL file (generated by Protege) in a Jupyter notebook.

I can read the root element, but for the child elements I get errors or empty results.

from xml.dom.minidom import parse

DOMTree = parse("pressman.owl")
collection = DOMTree.documentElement

if collection.hasAttribute("shelf"):
   print("Root element : %s" % collection.getAttribute("owl:ObjectProperty"))

for objectprop in collection.getElementsByTagName("owl:ObjectProperty"):
    if objectprop.hasAttribute("rdf:about"):
            propertytext = objectprop.getAttribute("rdf:about")
            property = propertytext.split('#',2)
            print ("Property: %s" % property[1])
            type = objectprop.getElementsByTagName('rdf:resource')
            print ("Type: %s" % type)

The pressman.owl file (abridged):

<rdf:RDF xmlns="http://www.semanticweb.org/sraza/ontologies/2021/4/untitled-ontology-6#"
     xml:base="http://www.semanticweb.org/sraza/ontologies/2021/4/untitled-ontology-6"
     xmlns:owl="http://www.w3.org/2002/07/owl#"
     xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
     xmlns:xml="http://www.w3.org/XML/1998/namespace"
     xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
     xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
     xmlns:PressmanOntology="urn:absolute:PressmanOntology#"
     xmlns:UniversityOntology="http://www.semanticweb.org/sraza/ontologies/2021/4/UniversityOntology#">
    <owl:Ontology rdf:about="urn:absolute:PressmanOntology"/>
    
    <!-- Object Properties -->

    <owl:ObjectProperty rdf:about="urn:absolute:PressmanOntology#hasAdvice"/>

    <owl:ObjectProperty rdf:about="urn:absolute:PressmanOntology#hasDefinition">
        <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#FunctionalProperty"/>
        <rdfs:domain rdf:resource="urn:absolute:PressmanOntology#SEPressman"/>
        <rdfs:range rdf:resource="urn:absolute:PressmanOntology#SEPressman"/>
    </owl:ObjectProperty>

    <owl:ObjectProperty rdf:about="urn:absolute:PressmanOntology#hasDiagram">
        <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#FunctionalProperty"/>
        <rdfs:domain rdf:resource="urn:absolute:PressmanOntology#SEPressman"/>
        <rdfs:range rdf:resource="urn:absolute:PressmanOntology#SEPressman"/>
    </owl:ObjectProperty>

    <!-- more entries... -->    
</rdf:RDF>

The output is:

Property: hasAdvice
Type: []
Property: hasDefinition
Type: []
Property: hasDiagram
Type: []

1 answer:

Answer 0 (score: 1)

You have this structure:

<owl:ObjectProperty rdf:about="urn:absolute:PressmanOntology#hasDefinition">
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#FunctionalProperty"/>
    <rdfs:domain rdf:resource="urn:absolute:PressmanOntology#SEPressman"/>
    <rdfs:range rdf:resource="urn:absolute:PressmanOntology#SEPressman"/>
</owl:ObjectProperty>

and you are using

type = objectprop.getElementsByTagName('rdf:resource')

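This cannot work, because rdf:resource is not an element but an attribute. I assume the one you are interested in belongs to <rdf:type>. So we need to go down one level: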

rdf_type = objectprop.getElementsByTagName('rdf:type')

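Now rdf_type is a node list. After all, the method is called "get elements by tag name", and minidom cannot know that in your case there may be only a single <rdf:type>. We take the first one, if it exists: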

rdf_type = rdf_type[0] if len(rdf_type) > 0 else None
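
To make the element/attribute distinction concrete, here is a minimal sketch. It assumes an inline SAMPLE string shaped like one entry of pressman.owl (the string and the variable names by_attribute_name and by_element_name are made up for illustration) and compares the two lookups:

```python
from xml.dom.minidom import parseString

# Assumption: a cut-down stand-in for one entry of pressman.owl.
SAMPLE = """<rdf:RDF xmlns:owl="http://www.w3.org/2002/07/owl#"
     xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <owl:ObjectProperty rdf:about="urn:absolute:PressmanOntology#hasDefinition">
        <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#FunctionalProperty"/>
    </owl:ObjectProperty>
</rdf:RDF>"""

objectprop = parseString(SAMPLE).getElementsByTagName("owl:ObjectProperty")[0]

# rdf:resource is an attribute name, so no element matches it.
by_attribute_name = objectprop.getElementsByTagName("rdf:resource")

# rdf:type is an element name, so the child element is found.
by_element_name = objectprop.getElementsByTagName("rdf:type")

print(len(by_attribute_name))
print(len(by_element_name))
print(by_element_name[0].tagName)
```

The first lookup returns an empty node list, which is why the question's output shows "Type: []".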

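Now rdf:resource is an attribute of that element. In minidom, attributes are accessed with .getAttribute().

In theory the rdf:resource attribute could be missing from the XML, so let's make sure it exists before using it: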

if rdf_type is not None and rdf_type.hasAttribute('rdf:resource'):
    rdf_resource = rdf_type.getAttribute('rdf:resource')
else:
    rdf_resource = None

print(rdf_resource)
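
Put together, the steps above might look like the following sketch. It reads from an inline SAMPLE string instead of pressman.owl so that it runs on its own, and the helper name property_types is hypothetical, not part of minidom:

```python
from xml.dom.minidom import parseString

# Assumption: a cut-down stand-in for pressman.owl with two entries.
SAMPLE = """<rdf:RDF xmlns:owl="http://www.w3.org/2002/07/owl#"
     xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <owl:ObjectProperty rdf:about="urn:absolute:PressmanOntology#hasAdvice"/>
    <owl:ObjectProperty rdf:about="urn:absolute:PressmanOntology#hasDefinition">
        <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#FunctionalProperty"/>
    </owl:ObjectProperty>
</rdf:RDF>"""


def property_types(xml_text):
    """Return (property name, rdf:type resource or None) pairs."""
    root = parseString(xml_text).documentElement
    results = []
    for objectprop in root.getElementsByTagName("owl:ObjectProperty"):
        if not objectprop.hasAttribute("rdf:about"):
            continue
        # Keep the part after '#', as the question's code does.
        name = objectprop.getAttribute("rdf:about").split("#", 1)[-1]

        # Step 1: find the <rdf:type> child elements (a node list).
        rdf_types = objectprop.getElementsByTagName("rdf:type")
        # Step 2: take the first one, if there is one.
        rdf_type = rdf_types[0] if len(rdf_types) > 0 else None
        # Step 3: read its rdf:resource attribute, if present.
        if rdf_type is not None and rdf_type.hasAttribute("rdf:resource"):
            resource = rdf_type.getAttribute("rdf:resource")
        else:
            resource = None
        results.append((name, resource))
    return results


for name, resource in property_types(SAMPLE):
    print("Property: %s" % name)
    print("Type: %s" % resource)
```

For the real file, replacing parseString(xml_text) with parse("pressman.owl") should give the same kind of result. Entries without an <rdf:type> child, such as hasAdvice, report None.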

All that said, rather than processing RDF files by hand, it may be worth looking at a library written for RDF, such as rdflib, or even one written specifically for OWL.
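
If you do stay with minidom, one possible refinement is to look names up by namespace URI rather than by prefix, since prefixes such as rdf: are only a convention of the file. The sketch below is an assumption-laden illustration, not part of the answer above: the SAMPLE string deliberately uses the made-up prefixes o: and r:, and the constant names OWL_NS and RDF_NS are chosen here for readability.

```python
from xml.dom.minidom import parseString

OWL_NS = "http://www.w3.org/2002/07/owl#"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Assumption: same structure as pressman.owl, but with different prefixes
# to show that the lookup does not depend on them.
SAMPLE = """<r:RDF xmlns:o="http://www.w3.org/2002/07/owl#"
     xmlns:r="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <o:ObjectProperty r:about="urn:absolute:PressmanOntology#hasDefinition">
        <r:type r:resource="http://www.w3.org/2002/07/owl#FunctionalProperty"/>
    </o:ObjectProperty>
</r:RDF>"""

doc = parseString(SAMPLE)
resources = []
for prop in doc.getElementsByTagNameNS(OWL_NS, "ObjectProperty"):
    for rdf_type in prop.getElementsByTagNameNS(RDF_NS, "type"):
        # Attributes are also looked up by (namespace URI, local name).
        if rdf_type.hasAttributeNS(RDF_NS, "resource"):
            resources.append(rdf_type.getAttributeNS(RDF_NS, "resource"))

print(resources)
```

This is still manual XML handling, so the advice about using an RDF-aware library applies just as much.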
