How do I get an lxml.objectify element from a child that has a different namespace?

Asked: 2016-11-09 07:06:58

Tags: python xml xpath lxml

I have the following Python script:

from lxml import objectify
xml = objectify.fromstring("""<?xml version="1.0" encoding="utf-8"?>
<cfdi:Comprobante xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:cfdi="http://www.sat.gob.mx/cfd/3" xsi:schemaLocation="http://www.sat.gob.mx/cfd/3 http://www.sat.gob.mx/sitio_internet/cfd/3/cfdv32.xsd">
  <cfdi:Emisor rfc="XYZU8801017YA" nombre="MOYLOP260">
    <cfdi:DomicilioFiscal calle="Calle value"/>
    <cfdi:RegimenFiscal Regimen="Regimen value" />
  </cfdi:Emisor>
  <cfdi:Complemento>
    <tfd:TimbreFiscalDigital xsi:schemaLocation="http://www.sat.gob.mx/TimbreFiscalDigital http://www.sat.gob.mx/TimbreFiscalDigital/TimbreFiscalDigital.xsd" xmlns:tfd="http://www.sat.gob.mx/TimbreFiscalDigital"
        version="1.0" UUID="UUID value"/>
  </cfdi:Complemento>
</cfdi:Comprobante>""")
print "xml.Emisor.DomicilioFiscal.get('calle'):", xml.Emisor.DomicilioFiscal.get('calle')
print "xml.Emisor.RegimenFiscal.get('Regimen'):", xml.Emisor.RegimenFiscal.get('Regimen')
tfd = xml.Complemento.xpath('tfd:TimbreFiscalDigital[1]',
                            namespaces={'tfd': 'http://www.sat.gob.mx/TimbreFiscalDigital'})
print "tfd[0].get('UUID'):", tfd[0].get('UUID')
try:
    print "xml.Complemento.TimbreFiscalDigital: ", xml.Complemento.TimbreFiscalDigital.get('UUID')
except AttributeError:
    print "Why I have a AttributeError here?"

The output is:

xml.Emisor.DomicilioFiscal.get('calle'): Calle value
xml.Emisor.RegimenFiscal.get('Regimen'): Regimen value
tfd[0].get('UUID'): UUID value
xml.Complemento.TimbreFiscalDigital:  Why I have a AttributeError here?

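I need to get the UUID value from the last node, but I would rather not hard-code the XML namespace in the XPath expression, because the namespace is already defined in the XML string itself.

Can you help me? Thanks!

Do I need to update the namespace from the child elements?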

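1 Answer:

Answer 0: (score: 0)

According to http://lxml.de/objectify.html#namespace-handling, attribute lookup on an objectify element uses the parent's namespace by default. TimbreFiscalDigital lives in a different namespace than Complemento, so you need to supply the child's namespace when doing the lookup: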

tfd = xml.Complemento["{http://www.sat.gob.mx/TimbreFiscalDigital}TimbreFiscalDigital"]

Alternatively:

tfd = getattr(xml.Complemento, "{http://www.sat.gob.mx/TimbreFiscalDigital}TimbreFiscalDigital")
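
To make the difference between the plain and the namespace-qualified lookup concrete, here is a minimal sketch using a trimmed-down copy of the document from the question. It is written for Python 3 (hence the bytes literal and print() calls), and the names XML, TFD_NS, complemento and found_without_ns are only illustrative, not part of lxml:

```python
from lxml import objectify

# Trimmed-down version of the question's document (bytes, because the
# XML declaration carries an encoding).
XML = b"""<?xml version="1.0" encoding="utf-8"?>
<cfdi:Comprobante xmlns:cfdi="http://www.sat.gob.mx/cfd/3">
  <cfdi:Complemento>
    <tfd:TimbreFiscalDigital xmlns:tfd="http://www.sat.gob.mx/TimbreFiscalDigital"
        version="1.0" UUID="UUID value"/>
  </cfdi:Complemento>
</cfdi:Comprobante>"""

TFD_NS = "http://www.sat.gob.mx/TimbreFiscalDigital"

root = objectify.fromstring(XML)
complemento = root.Complemento  # same namespace as the root, so this works

# Plain attribute access searches in the parent's namespace (cfdi),
# where no TimbreFiscalDigital child exists.
try:
    complemento.TimbreFiscalDigital
    found_without_ns = True
except AttributeError:
    found_without_ns = False

# Namespace-qualified lookups, in Clark notation: {namespace}localname
tfd_item = complemento["{%s}TimbreFiscalDigital" % TFD_NS]
tfd_attr = getattr(complemento, "{%s}TimbreFiscalDigital" % TFD_NS)

print(found_without_ns)
print(tfd_item.get("UUID"))
print(tfd_attr.get("UUID"))
```

Both qualified forms return the same child element; which one to use is mostly a matter of taste.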

The only way (that I can think of) to get a specific child element without specifying the child's namespace is to use local-name():

tfd = xml.Complemento.xpath("*[local-name() = 'TimbreFiscalDigital']")[0]
print tfd.get("UUID")
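
If you would rather stay out of XPath, a possible alternative is to walk the children and compare local names with etree.QName, which also lets you read the namespace back from the element instead of hard-coding it. This is only a sketch for Python 3; child_by_local_name is a hypothetical helper written for this example, not an lxml function:

```python
from lxml import etree, objectify

XML = b"""<?xml version="1.0" encoding="utf-8"?>
<cfdi:Comprobante xmlns:cfdi="http://www.sat.gob.mx/cfd/3">
  <cfdi:Complemento>
    <tfd:TimbreFiscalDigital xmlns:tfd="http://www.sat.gob.mx/TimbreFiscalDigital"
        version="1.0" UUID="UUID value"/>
  </cfdi:Complemento>
</cfdi:Comprobante>"""


def child_by_local_name(parent, local_name):
    """Return the first child whose local name matches, or None.

    Hypothetical helper: the namespace of the child is ignored entirely.
    """
    # iterchildren() is used on purpose: iterating an objectify element
    # directly yields same-named siblings, not its children.
    for child in parent.iterchildren():
        if not isinstance(child.tag, str):
            continue  # skip comments and processing instructions
        if etree.QName(child).localname == local_name:
            return child
    return None


root = objectify.fromstring(XML)
complemento = root.Complemento

tfd = child_by_local_name(complemento, "TimbreFiscalDigital")
print(tfd.get("UUID"))

# The namespace can be read from the element itself and reused
# for a qualified lookup, so it never appears as a literal.
tfd_ns = etree.QName(tfd).namespace
same_tfd = complemento["{%s}TimbreFiscalDigital" % tfd_ns]
print(same_tfd.get("UUID"))
```

Keep in mind that matching on the local name alone will pick up any element called TimbreFiscalDigital, whatever its namespace, so it is only safe when you know the names cannot collide.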