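Is it possible to insert XML attributes with the right namespace prefix using lxml?
For example, I want to use XLink to insert links in an XML document. All I need to do is insert {http://www.w3.org/1999/xlink}href attributes into some elements. I would like to use the xlink prefix, but lxml generates prefixes such as "ns0", "ns1", and so on.
Here is what I tried: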
from lxml import etree
#: Name (and namespace) of the *href* attribute used to insert links.
HREF_ATTR = etree.QName("http://www.w3.org/1999/xlink", "href").text
content = """\
<body>
<p>Link to <span>StackOverflow</span></p>
<p>Link to <span>Google</span></p>
</body>
"""
targets = ["https://stackoverflow.com", "https://www.google.fr"]
body_elem = etree.XML(content)
for span_elem, target in zip(body_elem.iter("span"), targets):
    span_elem.attrib[HREF_ATTR] = target
etree.dump(body_elem)
The dump looks like this:
<body>
<p>Link to <span xmlns:ns0="http://www.w3.org/1999/xlink"
ns0:href="https://stackoverflow.com">StackOverflow</span></p>
<p>Link to <span xmlns:ns1="http://www.w3.org/1999/xlink"
ns1:href="https://www.google.fr">Google</span></p>
</body>
I found a way to factor out the namespace declaration by inserting and then deleting the attribute on the root element, as follows:
# trick to declare the XLink namespace globally (only one time).
body_elem = etree.XML(content)
body_elem.attrib[HREF_ATTR] = ""
del body_elem.attrib[HREF_ATTR]
targets = ["https://stackoverflow.com", "https://www.google.fr"]
for span_elem, target in zip(body_elem.iter("span"), targets):
    span_elem.attrib[HREF_ATTR] = target
etree.dump(body_elem)
It's ugly, but it works, and I only need to do it once. I get:
<body xmlns:ns0="http://www.w3.org/1999/xlink">
<p>Link to <span ns0:href="https://stackoverflow.com">StackOverflow</span></p>
<p>Link to <span ns0:href="https://www.google.fr">Google</span></p>
</body>
But the problem remains: how can I turn this "ns0" prefix into "xlink"?
Answer 0 (score: 1)
Use register_namespace, as suggested by @mzjn:
etree.register_namespace("xlink", "http://www.w3.org/1999/xlink")
# trick to declare the XLink namespace globally (only one time).
body_elem = etree.XML(content)
body_elem.attrib[HREF_ATTR] = ""
del body_elem.attrib[HREF_ATTR]
targets = ["https://stackoverflow.com", "https://www.google.fr"]
for span_elem, target in zip(body_elem.iter("span"), targets):
    span_elem.attrib[HREF_ATTR] = target
etree.dump(body_elem)
The result is what I expected:
<body xmlns:xlink="http://www.w3.org/1999/xlink">
<p>Link to <span xlink:href="https://stackoverflow.com">StackOverflow</span></p>
<p>Link to <span xlink:href="https://www.google.fr">Google</span></p>
</body>
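As a possible alternative to the insert-and-delete trick, etree.cleanup_namespaces() accepts a top_nsmap argument in reasonably recent lxml versions (3.5 or later, if I remember correctly). It declares the given prefixes on the root element and remaps the subtree to use them, so the per-element "ns0"/"ns1" declarations should disappear. A minimal sketch reusing the document from the question; the XLINK_NS constant and the result variable are names introduced here for illustration only:

```python
from lxml import etree

# Illustrative constant, not part of the original post.
XLINK_NS = "http://www.w3.org/1999/xlink"
# Clark notation: "{http://www.w3.org/1999/xlink}href"
HREF_ATTR = etree.QName(XLINK_NS, "href").text

content = """\
<body>
<p>Link to <span>StackOverflow</span></p>
<p>Link to <span>Google</span></p>
</body>
"""
targets = ["https://stackoverflow.com", "https://www.google.fr"]

body_elem = etree.XML(content)
for span_elem, target in zip(body_elem.iter("span"), targets):
    # At this point lxml still picks generated prefixes (ns0, ns1, ...).
    span_elem.set(HREF_ATTR, target)

# Declare the xlink prefix once on the root element and let lxml
# replace the generated per-element declarations with it.
etree.cleanup_namespaces(body_elem, top_nsmap={"xlink": XLINK_NS})

result = etree.tostring(body_elem, encoding="unicode")
print(result)
```

One difference worth noting: register_namespace changes a process-wide prefix registry, whereas top_nsmap only affects the tree passed to that one call. Either way, the prefix is purely a serialization detail; the attribute is still reachable through its full {namespace}href name.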