Question

I'm using etree to walk through an XML file.

import xml.etree.ElementTree as etree

tree = etree.parse('x.xml')
root = tree.getroot()
for child in root[0]:
    for child in child.getchildren():
        for child in child.getchildren():
            for child in child.getchildren():
                print(child.attrib)

What is the idiomatic way to avoid these nested for loops in Python?

  getchildren() ⇒ list of Element instances
    Returns all subelements. The elements are returned in document order.

Returns:
A list of subelements.

I've seen some posts on SO, such as "Avoiding nested for loops", but they don't translate directly to my use case.

Thanks.

Answer 1

If you want to get the children that are n levels deep in the tree and then iterate over them, you can do this:

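def childrenAtLevel(tree, n):
    # Yield every element exactly n levels below `tree` (n >= 1).
    # Iterating an element yields its direct children; getchildren()
    # was removed in Python 3.9.
    if n == 1:
        yield from tree
    else:
        for child in tree:
            yield from childrenAtLevel(child, n - 1)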

Then, to get the elements four levels deep, you simply say:

for e in childrenAtLevel(root, 4):
    print(e.attrib)  # do something with e
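To see the generator at work, here is a minimal self-contained sketch. The sample XML and the snake_case name children_at_level are made up for illustration; the function restates the one above and iterates elements directly rather than calling getchildren(), which is not available on Python 3.9 and later.

```python
import xml.etree.ElementTree as etree

# Hypothetical sample document: the <d> elements sit four levels below the root.
SAMPLE = """
<root>
  <a id="a1"><b id="b1"><c id="c1"><d id="d1"/><d id="d2"/></c></b></a>
  <a id="a2"><b id="b2"><c id="c2"><d id="d3"/></c></b></a>
</root>
"""

def children_at_level(node, n):
    """Yield every element exactly n levels below node (n >= 1)."""
    if n == 1:
        # Iterating an element yields its direct children.
        yield from node
    else:
        for child in node:
            yield from children_at_level(child, n - 1)

root = etree.fromstring(SAMPLE)
ids = [e.get("id") for e in children_at_level(root, 4)]
print(ids)
```

Note that the loop in the question starts from root[0] rather than root, so passing root[0] as the first argument would restrict the walk to the first child's subtree, as the question does.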

Or, if you want to get all the leaf nodes (i.e. nodes that have no children of their own), you can do this:

def getLeafNodes(tree):
    # An element with no children has length 0, so it is a leaf.
    if len(tree) == 0:
        yield tree
    else:
        for child in tree:
            yield from getLeafNodes(child)
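As a possible alternative to writing the recursion by hand, Element.iter() already walks a whole subtree in document order, so the leaves can simply be filtered out of it. This is only a sketch: the sample XML and the name leaf_nodes are illustrative, not part of the answer above.

```python
import xml.etree.ElementTree as etree

def leaf_nodes(node):
    """Return an iterator over the elements in node's subtree that have no children."""
    # Element.iter() yields node itself first, then all of its descendants
    # in document order; len(element) is the number of direct children.
    return (e for e in node.iter() if len(e) == 0)

# Hypothetical sample document with three leaves: <b>, <d> and <e>.
root = etree.fromstring("<r><a><b/><c><d/></c></a><e/></r>")
leaf_tags = [e.tag for e in leaf_nodes(root)]
print(leaf_tags)
```

Because no Python-level recursion is involved, this version should also be unaffected by the recursion limit on very deep documents.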

Answer 2

itertools.chain.from_iterable flattens one level of nesting; you can use functools.reduce to apply it n times (see Compressing "n"-time object member call):

from itertools import chain
from functools import reduce

for child in reduce(lambda x, _: chain.from_iterable(x), range(3), root):
    print(child.attrib)
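If the depth should be a parameter, the one-liner can be wrapped in a small helper. The sketch below is one way to do that; the name descendants_at_depth and the sample XML are assumptions for illustration. It starts from a one-element list so that each application of chain.from_iterable descends exactly one level, which makes depth count levels below the starting node.

```python
from functools import reduce
from itertools import chain
import xml.etree.ElementTree as etree

def descendants_at_depth(node, depth):
    """Return an iterator over the elements exactly `depth` levels below node."""
    # [node] is an iterable of elements; flattening it once yields node's
    # children, flattening twice yields its grandchildren, and so on.
    return reduce(lambda level, _: chain.from_iterable(level), range(depth), [node])

# Hypothetical sample document: the <d> elements are four levels below <r>.
root = etree.fromstring("<r><a><b><c><d k='1'/><d k='2'/></c></b></a></r>")
attribs = [e.attrib for e in descendants_at_depth(root, 4)]
print(attribs)
```

With depth=4 this visits the same elements as the loop above, which applies chain.from_iterable three times but starts from root itself, so the first application already reaches the grandchildren.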

Note that getchildren is deprecated (it was removed in Python 3.9); iterating over a node yields its children directly.
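A short sketch of what direct iteration looks like, using a made-up sample document. It also shows an alternative that neither answer mentions: ElementTree's path syntax accepts "*" for "any child element", so a path with one "*" per level should select everything at a fixed depth.

```python
import xml.etree.ElementTree as etree

# Hypothetical sample document: the <d> elements are four levels below <r>.
root = etree.fromstring("<r><a><b><c><d k='1'/><d k='2'/></c></b></a></r>")

# Iterating an element yields its direct children, so no method call is needed.
children = list(root)

# Alternative (not from the answers): build the path "*/*/*/*", one "*" step
# per level, and let findall() collect the matches in document order.
depth = 4
deep = root.findall("/".join(["*"] * depth))
print([e.attrib for e in deep])
```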
