How can I print/dump the "absolute path" and value of each element in an XML document using Python?
For example, from:
<A>
    <B>foo</B>
    <C>
        <D>On</D>
    </C>
    <E>Auto</E>
    <F>
        <G>
            <H>shoo</H>
            <I>Off</I>
        </G>
    </F>
</A>
to:
/A/B, foo
/A/C/D, On
/A/E, Auto
/A/F/G/H, shoo
/A/F/G/I, Off
Answer 0 (score: 2)
from lxml import etree

# your_xml_string holds the XML document as a string
root = etree.XML(your_xml_string)

def print_path_of_elems(elem, elem_path=""):
    for child in elem:
        child_path = "%s/%s" % (elem_path, child.tag)
        if len(child) == 0:
            # leaf node: print it only if it has non-blank text
            if child.text and child.text.strip():
                print("%s, %s" % (child_path, child.text))
        else:
            # node with child elements => recurse
            print_path_of_elems(child, child_path)

# start with a leading slash so the paths come out as /A/B rather than A/B
print_path_of_elems(root, "/" + root.tag)
Answer 1 (score: 2)
Another way to do it:
from lxml import etree

XMLDoc = etree.parse('file.xml')
for Node in XMLDoc.xpath('//*'):
    # keep only leaf elements that carry text
    if len(Node) == 0 and Node.text:
        print(XMLDoc.getpath(Node), Node.text)
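Depending on the structure of the document, the paths returned by getpath may contain positional indices (for example /A/B[2] when an element has several siblings with the same tag), which you may need to strip.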
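If those positional indices get in the way, one possible approach is to remove them with a regular expression. This is only a sketch: the helper name strip_positions is made up for illustration, and it assumes the only bracketed parts in the path are numeric predicates such as [2].

```python
import re

# Matches positional predicates such as [1] or [23] in a path string.
_POSITION = re.compile(r"\[\d+\]")

def strip_positions(path):
    """Return the path with every numeric [n] predicate removed."""
    return _POSITION.sub("", path)

print(strip_positions("/A/B[2]/C[1]"))
```

Note that several elements can end up with the same path once the indices are gone, so the result no longer identifies a single node.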
Answer 2 (score: 0)
Something like this should work for you:
from xml.etree.ElementTree import ElementTree

tree = ElementTree()
tree.parse('file.xml')
root = tree.getroot()

def print_abs_path(root, path=None):
    if path is None:
        path = [root.tag]
    for child in root:
        # child.text is None for empty elements, so guard before stripping
        text = (child.text or '').strip()
        new_path = path[:]
        new_path.append(child.tag)
        if text:
            print('/{0}, {1}'.format('/'.join(new_path), text))
        print_abs_path(child, new_path)

print_abs_path(root)
Answer 3 (score: 0)
A thoroughly inefficient XPath solution:
>>> from lxml import etree
>>> tree = etree.fromstring("""
... <A>
... <B>foo</B>
... <C>
... <D>On</D>
... </C>
... <E>Auto</E>
... <F>
... <G>
... <H>shoo</H>
... <I>Off</I>
... </G>
... </F>
... </A>
... """)
>>> for node in tree.xpath('//*[normalize-space(text())]'):
...     print('/%s, %s' % (
...         '/'.join(a.tag for a in node.xpath('ancestor-or-self::*')), node.text))
...
/A/B, foo
/A/C/D, On
/A/E, Auto
/A/F/G/H, shoo
/A/F/G/I, Off
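If running an ancestor query per node is too slow for a large document, a single streaming pass is one alternative. The sketch below is not from any of the answers: it uses the standard library's xml.etree.ElementTree.iterparse instead of lxml, keeps a stack of open tags, and the function name iter_leaf_paths is made up for illustration. It assumes only childless elements with non-blank text should be reported.

```python
import io
import xml.etree.ElementTree as ET

def iter_leaf_paths(source):
    """Yield (path, text) for each childless element with non-blank text."""
    stack = []  # tags of the elements that are currently open
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            stack.append(elem.tag)
        else:
            # On "end" the element is complete, so text and children are known.
            text = (elem.text or "").strip()
            if len(elem) == 0 and text:
                yield "/" + "/".join(stack), text
            stack.pop()

xml_doc = (b"<A><B>foo</B><C><D>On</D></C><E>Auto</E>"
           b"<F><G><H>shoo</H><I>Off</I></G></F></A>")

for path, text in iter_leaf_paths(io.BytesIO(xml_doc)):
    print("%s, %s" % (path, text))
```

iterparse also accepts a filename, so iter_leaf_paths('file.xml') should work the same way for a document on disk.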