Python: absolute paths in an XML document

Time: 2011-12-12 13:53:45

Tags: python xml

How can I use Python to print/dump the "absolute path" and value of each element in an XML document?

For example, this document:

<A>
  <B>foo</B>
  <C>
    <D>On</D>
  </C>
  <E>Auto</E>
  <F>
    <G>
      <H>shoo</H>
      <I>Off</I>
    </G>
  </F>
</A>

should produce:

/A/B, foo
/A/C/D, On
/A/E, Auto
/A/F/G/H, shoo
/A/F/G/I, Off

4 answers:

Answer 0 (score: 2)

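from lxml import etree

# your_xml_string: the XML document as a str or bytes object
root = etree.XML(your_xml_string)

def print_path_of_elems(elem, elem_path=""):
    for child in elem:
        # len(child) == 0 means no child elements (getchildren() is deprecated)
        if len(child) == 0 and child.text:
            # leaf node with text => print
            print("%s/%s, %s" % (elem_path, child.tag, child.text))
        else:
            # node with child elements => recurse
            print_path_of_elems(child, "%s/%s" % (elem_path, child.tag))

# start with a leading slash so the paths are absolute, e.g. /A/B
print_path_of_elems(root, "/" + root.tag)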

Answer 1 (score: 2)

Another approach:

from lxml import etree

xml_doc = etree.parse('file.xml')

for node in xml_doc.xpath('//*'):
    # only leaf elements that carry text
    if len(node) == 0 and node.text:
        print("%s, %s" % (xml_doc.getpath(node), node.text))

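Depending on the structure of the document, the XPath returned by getpath() may contain node indices (for example /A/B[2] when an element has several siblings with the same tag), which you may need to strip.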
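
As a minimal sketch of stripping those indices: assuming the positional predicates always look like `[2]`, a regular expression can remove them. The helper name `strip_indices` is hypothetical, and only the standard `re` module is used.

```python
import re

# Hypothetical helper: drops positional predicates such as "[2]" from a
# path string like the ones getpath() returns.
_INDEX = re.compile(r"\[\d+\]")

def strip_indices(path):
    return _INDEX.sub("", path)

print(strip_indices("/A/F/G[2]/H[1]"))  # prints /A/F/G/H
```

Note that once the indices are gone, repeated siblings all map to the same path, so the result is no longer a unique address for each element.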

Answer 2 (score: 0)

Something like this should work for you:

from xml.etree.ElementTree import ElementTree

tree = ElementTree()
tree.parse('file.xml')
root = tree.getroot()

def print_abs_path(root, path=None):
    if path is None:
        path = [root.tag]

    for child in root:
        # child.text is None for empty elements, so guard before stripping
        text = (child.text or '').strip()
        new_path = path[:]
        new_path.append(child.tag)
        if text:
            print('/{0}, {1}'.format('/'.join(new_path), text))
        print_abs_path(child, new_path)

print_abs_path(root)
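
If the pairs are needed as data rather than printed output, the same traversal can be written as a generator. This is only a sketch using the standard library; the name `iter_paths` is hypothetical, and the sample document is a compacted copy of the one from the question.

```python
import xml.etree.ElementTree as ET

def iter_paths(elem, prefix=""):
    """Yield (absolute_path, text) for every element with non-blank text."""
    path = "%s/%s" % (prefix, elem.tag)
    text = (elem.text or "").strip()  # elem.text is None for empty elements
    if text:
        yield path, text
    for child in elem:
        yield from iter_paths(child, path)

SAMPLE = """<A>
  <B>foo</B>
  <C><D>On</D></C>
  <E>Auto</E>
  <F><G><H>shoo</H><I>Off</I></G></F>
</A>"""

for path, text in iter_paths(ET.fromstring(SAMPLE)):
    print("%s, %s" % (path, text))
```

Building the path as a string while descending avoids copying a list at every level, and returning tuples leaves the output format up to the caller.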

Answer 3 (score: 0)

A thoroughly inefficient XPath solution:

>>> from lxml import etree
>>> tree = etree.fromstring("""
... <A>
...   <B>foo</B>
...   <C>
...     <D>On</D>
...   </C>
...   <E>Auto</E>
...   <F>
...     <G>
...       <H>shoo</H>
...       <I>Off</I>
...     </G>
...   </F>
... </A>
... """)
>>> for node in tree.xpath('//*[normalize-space(text())]'):
...     print('/%s, %s' % (
...         '/'.join(a.tag for a in node.xpath('ancestor-or-self::*')), node.text))
... 
/A/B, foo
/A/C/D, On
/A/E, Auto
/A/F/G/H, shoo
/A/F/G/I, Off