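I'm trying to get a compact representation of namespaces in ElementTree or lxml when the child elements are in a different namespace from the parent. Here is a basic example: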
from lxml import etree
country = etree.Element("country")
name = etree.SubElement(country, "{urn:test}name")
name.text = "Canada"
population = etree.SubElement(country, "{urn:test}population")
population.text = "34M"
etree.register_namespace('tst', 'urn:test')
print( etree.tostring(country, pretty_print=True) )
I also tried this approach:
ns = {"test" : "urn:test"}
country = etree.Element("country", nsmap=ns)
name = etree.SubElement(country, "{test}name")
name.text = "Canada"
population = etree.SubElement(country, "{test}population")
population.text = "34M"
print( etree.tostring(country, pretty_print=True) )
In both cases I get something like this:
<country>
<ns0:name xmlns:ns0="urn:test">Canada</ns0:name>
<ns1:population xmlns:ns1="urn:test">34M</ns1:population>
</country>
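While this is correct, I'd like it to be less verbose. This can become a real problem for large data sets (especially since I'm using a much longer namespace than 'urn:test').
If I'm willing to put 'country' inside the "urn:test" namespace and declare it like this (in the first example above):
country = etree.Element("{urn:test}country")
then I get the following output: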
<ns0:country xmlns:ns0="urn:test">
<ns0:name>Canada</ns0:name>
<ns0:population>34M</ns0:population>
</ns0:country>
But what I really want is this:
<country xmlns:ns0="urn:test">
<ns0:name>Canada</ns0:name>
<ns0:population>34M</ns0:population>
</country>
Any ideas?
Answer 0 (score: 2)
The full name of an element has the form {namespace-url}elementName, not {prefix}elementName:
>>> from lxml import etree as ET
>>> r = ET.Element('root', nsmap={'tst': 'urn:test'})
>>> ET.SubElement(r, "{urn:test}child")
<Element {urn:test}child at 0x2592a80>
>>> ET.tostring(r)
'<root xmlns:tst="urn:test"><tst:child/></root>'
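As a quick check of the point above, the following sketch uses only the standard-library ElementTree (the helper name qualified_tags is made up for illustration). It parses the verbose output from the question and a compact equivalent, then compares the fully qualified tag names. It suggests that the prefix is only a serialization detail and that the namespace URI is what identifies the element:

```python
import xml.etree.ElementTree as ET

# Verbose form from the question: each child redeclares the namespace.
verbose = (
    '<country>'
    '<ns0:name xmlns:ns0="urn:test">Canada</ns0:name>'
    '<ns1:population xmlns:ns1="urn:test">34M</ns1:population>'
    '</country>'
)

# Compact form: the namespace is declared once on the root.
compact = (
    '<country xmlns:tst="urn:test">'
    '<tst:name>Canada</tst:name>'
    '<tst:population>34M</tst:population>'
    '</country>'
)

def qualified_tags(xml_text):
    """Hypothetical helper: list every tag in {namespace-uri}local-name form."""
    return [el.tag for el in ET.fromstring(xml_text).iter()]

verbose_tags = qualified_tags(verbose)
compact_tags = qualified_tags(compact)

print(verbose_tags)                  # ['country', '{urn:test}name', '{urn:test}population']
print(verbose_tags == compact_tags)  # True
```

Both documents parse to the same qualified names, which is why "{urn:test}name" is the right thing to pass to SubElement regardless of the prefix chosen in nsmap.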
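In your case, an even more compact representation is possible if you change the default namespace. Unfortunately, lxml does not seem to allow an empty XML namespace, but, as you said, you can put the parent tag in the same namespace as the children, so you can set the default namespace to the children's namespace: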
>>> r = ET.Element('{urn:test}root', nsmap={None: 'urn:test'})
>>> ET.SubElement(r, "{urn:test}child")
<Element {urn:test}child at 0x2592b20>
>>> ET.SubElement(r, "{urn:test}child")
<Element {urn:test}child at 0x25928f0>
>>> ET.tostring(r)
'<root xmlns="urn:test"><child/><child/></root>'
Answer 1 (score: 1)
This code:
from lxml import etree
ns = {"ns0" : "urn:test"}
country = etree.Element("country", nsmap=ns)
name = etree.SubElement(country, "{urn:test}name")
name.text = "Canada"
population = etree.SubElement(country, "{urn:test}population")
population.text = "34M"
print(etree.tostring(country, pretty_print=True))
seems to give the desired output:
<country xmlns:ns0="urn:test">
<ns0:name>Canada</ns0:name>
<ns0:population>34M</ns0:population>
</country>
But you still have to maintain the nsmap yourself.
Answer 2 (score: 1)
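One possible way to keep that bookkeeping in one place, assuming a single namespace, is sketched below. NS, NSMAP, and the qname helper are hypothetical names chosen for illustration, not part of lxml:

```python
from lxml import etree

NS = "urn:test"       # the namespace URI (assumed to be the only one in use)
NSMAP = {"ns0": NS}   # prefix -> URI mapping, declared once on the root

def qname(local_name):
    """Hypothetical helper: build the {namespace-uri}local-name form."""
    return "{%s}%s" % (NS, local_name)

# Declaring the mapping on the root lets the children reuse the prefix.
country = etree.Element("country", nsmap=NSMAP)
for local_name, text in (("name", "Canada"), ("population", "34M")):
    child = etree.SubElement(country, qname(local_name))
    child.text = text

xml_bytes = etree.tostring(country, pretty_print=True)
print(xml_bytes.decode("utf-8"))
```

With the URI and prefix defined once at module level, a change to either only has to be made in one spot.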
from xml.etree import ElementTree as ET  # cElementTree was removed in Python 3.9
##ET.register_namespace('tst', 'urn:test')
country = ET.Element("country")
name = ET.SubElement(country, "{urn:test}name")
name.text = "Canada"
population = ET.SubElement(country, "{urn:test}population")
population.text = "34M"
print(prettify(country))
The above gives (without registering any namespace):
<?xml version="1.0" ?>
<country xmlns:ns0="urn:test">
<ns0:name>Canada</ns0:name>
<ns0:population>34M</ns0:population>
</country>
And when I uncomment the commented-out line, it gives:
<?xml version="1.0" ?>
<country xmlns:tst="urn:test">
<tst:name>Canada</tst:name>
<tst:population>34M</tst:population>
</country>
Note: the prettify function is the one from here
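Since the prettify helper itself is not shown, here is a self-contained sketch that assumes it is a small minidom-based function along the following lines (an assumption, not necessarily the linked implementation). It uses only the standard library and illustrates that ElementTree collects the namespace declaration on the root element by itself:

```python
from xml.dom import minidom
from xml.etree import ElementTree as ET

def prettify(elem):
    """Assumed stand-in for the linked helper: re-parse with minidom and indent."""
    rough = ET.tostring(elem, "utf-8")
    return minidom.parseString(rough).toprettyxml(indent="  ")

# Optional: without this call ElementTree picks a generated prefix such as ns0.
ET.register_namespace("tst", "urn:test")

country = ET.Element("country")
name = ET.SubElement(country, "{urn:test}name")
name.text = "Canada"
population = ET.SubElement(country, "{urn:test}population")
population.text = "34M"

pretty = prettify(country)
print(pretty)
```

Note that register_namespace is a module-wide setting, so it affects every tree serialized afterwards in the same process.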