Using XPath with namespaces

Date: 2017-08-25 08:56:09

Tags: python xml xpath xml-parsing lxml

I have an XML document in CIM/XML format. The document uses two namespaces:

  • RDF
  • CIM

Part of the file looks like this:

<?xml version='1.0' encoding='UTF-8'?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:cim="http://iec.ch/TC57/2013/CIM-schema-cim16#">
  <cim:Terminal rdf:ID="_08d0270e-f753-4812-a1cc-0550d9864a23">
    <cim:IdentifiedObject.name>C:Y8CHTT402:ETTR:1</cim:IdentifiedObject.name>
    <cim:Terminal.ConductingEquipment rdf:resource="#_93030a09-6aac-46b5-bf5b-f75b90841675"/>
    <cim:ACDCTerminal.sequenceNumber>1</cim:ACDCTerminal.sequenceNumber>
  </cim:Terminal>
  <cim:Terminal rdf:ID="_5451fc7e-5d94-4d30-ab58-744ab841334d">
    <cim:IdentifiedObject.name>C:Y8CHTT402:ETTR:2</cim:IdentifiedObject.name>
    <cim:Terminal.ConductingEquipment rdf:resource="#_93030a09-6aac-46b5-bf5b-f75b90841675"/>
    <cim:ACDCTerminal.sequenceNumber>2</cim:ACDCTerminal.sequenceNumber>
  </cim:Terminal>
</rdf:RDF>

My goal is to find the Terminal etree element that has a given rdf:ID.

I can find all elements of a given type with etree.xpath. I worked out how to do this from the lxml documentation.

from lxml import etree
root = etree.parse(my_file)  # my_file: path to the CIM/XML document
RDFNS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
CIMNS = "http://iec.ch/TC57/2013/CIM-schema-cim16#"

# The prefixes 'x' and 'y' are arbitrary; only the namespace URIs must match.
all_objts = root.xpath('/y:RDF/x:Terminal',
                       namespaces={'x': CIMNS, 'y': RDFNS})  # This returns a list of all terminal objects

However, I cannot retrieve only the element with a given rdf:ID:

nodeID = "_08d0270e-f753-4812-a1cc-0550d9864a23"
tar_obj = root.xpath('/y:RDF/x:Terminal[@ID="%s"]' % nodeID,
                     namespaces={'x': CIMNS, 'y': RDFNS})  # Returns an empty list

I found a very similar post, but it does not properly answer this question.

I tried adding the namespace prefix to the ID attribute, as shown below:

root.xpath('/y:RDF/x:PowerTransformer[y:@ID="%s"]' % nodeID,
           namespaces={'x': CIMNS, 'y': RDFNS})

But this does not work:

File "lxml.etree.pyx", line 1507, in lxml.etree._Element.xpath (src\lxml\lxml.etree.c:52198)

  File "xpath.pxi", line 307, in lxml.etree.XPathElementEvaluator.__call__ (src\lxml\lxml.etree.c:152124)

  File "xpath.pxi", line 227, in lxml.etree._XPathEvaluatorBase._handle_result (src\lxml\lxml.etree.c:151097)

  File "xpath.pxi", line 213, in lxml.etree._XPathEvaluatorBase._raise_eval_error (src\lxml\lxml.etree.c:150950)

XPathEvalError: Invalid expression

Is there a way to use etree.xpath to search for an object with a given rdf:ID in a document that has multiple namespaces?

1 answer:

Answer 0 (score: 1)

The predicate in this expression contains an error:

root.xpath('/y:RDF/x:PowerTransformer[y:@ID="%s"]' % nodeID,
           namespaces={'x': CIMNS, 'y': RDFNS})

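You need to change [y:@ID="%s"] to [@y:ID="%s"]. In XPath, @ is shorthand for the attribute axis, so it must come before the whole qualified name, prefix included. The earlier attempt with a plain [@ID="%s"] returned an empty list because an unprefixed attribute name only matches attributes in no namespace, whereas rdf:ID belongs to the RDF namespace.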
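
The following is a minimal sketch of the corrected query, run against a trimmed copy of the document from the question. The prefixes rdf and cim, the NSMAP dictionary, and the variable names are choices made for this illustration only; any prefixes work as long as they map to the same namespace URIs. The second query shows the same lookup with an lxml XPath variable instead of string formatting, which avoids quoting problems if the ID ever contains a quote character.

```python
from lxml import etree

RDFNS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
CIMNS = "http://iec.ch/TC57/2013/CIM-schema-cim16#"
NSMAP = {"rdf": RDFNS, "cim": CIMNS}  # illustrative prefixes, not required names

# Trimmed copy of the sample document from the question.
XML = b"""<?xml version='1.0' encoding='UTF-8'?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:cim="http://iec.ch/TC57/2013/CIM-schema-cim16#">
  <cim:Terminal rdf:ID="_08d0270e-f753-4812-a1cc-0550d9864a23">
    <cim:IdentifiedObject.name>C:Y8CHTT402:ETTR:1</cim:IdentifiedObject.name>
  </cim:Terminal>
  <cim:Terminal rdf:ID="_5451fc7e-5d94-4d30-ab58-744ab841334d">
    <cim:IdentifiedObject.name>C:Y8CHTT402:ETTR:2</cim:IdentifiedObject.name>
  </cim:Terminal>
</rdf:RDF>"""

root = etree.fromstring(XML)
node_id = "_08d0270e-f753-4812-a1cc-0550d9864a23"

# The prefix goes after the '@': @rdf:ID, not rdf:@ID.
matches = root.xpath('/rdf:RDF/cim:Terminal[@rdf:ID="%s"]' % node_id,
                     namespaces=NSMAP)

# Same lookup, passing the ID as an XPath variable via a keyword argument.
matches_var = root.xpath('/rdf:RDF/cim:Terminal[@rdf:ID=$node_id]',
                         namespaces=NSMAP, node_id=node_id)

terminal = matches[0]
name = terminal.findtext("cim:IdentifiedObject.name", namespaces=NSMAP)
print(name)
```

Outside XPath, the same attribute can be read from an element with Clark notation, for example terminal.get("{%s}ID" % RDFNS).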