Removing an element's siblings from an etree in Python

Date: 2014-05-13 18:27:25

Tags: python xml elementtree xml.etree

I am trying to remove all siblings of a given element.

For example, given this etree object:

<xml>
    <letter name="A">
            <letter name="B">
                    <letter name="C">
                    </letter>
                    <letter name="D">
                    </letter>
                    <letter name="G">
                    </letter>
                    <letter name="H">
                    </letter>
                    <letter name="I">
                    </letter>
            </letter>
            <letter name="E">
                <letter name="F">
                </letter>
            </letter>
    </letter>
</xml>

I want to remove all siblings of the G node and get back:

<xml>
    <letter name="A">
            <letter name="B">
                    <letter name="G">
                    </letter>
            </letter>
            <letter name="E">
                <letter name="F">
                </letter>
            </letter>
    </letter>
</xml>

I want to do this iteratively, without using xpath or find.

Could you give me some hints on how to do it?

Here is the code I just wrote:

import xml.etree.ElementTree as etree
data = """

<xml>
    <letter name="A">
            <letter name="B">
                    <letter name="C">
                    </letter>
                    <letter name="D">
                    </letter>
                    <letter name="G">
                    </letter>
                    <letter name="H">
                    </letter>
                    <letter name="I">
                    </letter>
            </letter>
            <letter name="E">
                <letter name="F">
                </letter>
            </letter>
    </letter>
</xml>

"""
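tree = etree.fromstring(data)


# First pass: find the name of the element that directly contains "G".
for parent in tree.iter():
    for child in parent:
        for subchild in child:
            if subchild.attrib.get('name') == "G":
                parent_name = child.attrib.get('name')
                # print(parent_name)

# Second pass: remove every child of that element except "G".
for parent in tree.iter():
    if parent.attrib.get('name') == parent_name:
        # Loop over a copy; removing from parent while looping over it skips children.
        for child in list(parent):
            if child.attrib.get('name') == "G":
                print("not this")
            else:
                parent.remove(child)


print(etree.tostring(tree, encoding="unicode"))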

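Cheers!

1 answer:

Answer 0 (score: 1)

You're close. Once you find an element named G, you need to go back over the element that contains it and remove everything else. So you will want something more along these lines (using iteration rather than xpath or find, as you requested):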

>>> def remove(name, value, root):
    """
    Iterates through the children of the @root element and removes
    those where the attribute @name != @value.
    """
    # Loop over a copy: removing from root while looping over it skips children.
    for element in list(root):
        if element.attrib.get(name) != value:
            root.remove(element)


>>> def remove_siblings_of(name, value, root):
    """
    Recursively removes from the @root element all elements which (1) do
    not have @name == @value but (2) do have a sibling where @name == @value.
    """
    for element in list(root):
        if element.attrib.get(name) == value:
            remove(name, value, root)  # go back over root to remove the earlier siblings as well
        if len(element):
            remove_siblings_of(name, value, element)
    return root
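
One caveat worth checking when you try this: the ElementTree documentation warns that modifying an element while iterating over it can cause problems, much as with a Python list. The snippet below is only an illustrative sketch. The names XML, naive and safe are made up for this example, and the fragment mirrors the B element from the question. It contrasts removing children during iteration with looping over a snapshot made by list().

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment mirroring the "B" element from the question.
XML = (
    '<letter name="B">'
    '<letter name="C"/><letter name="D"/><letter name="G"/>'
    '<letter name="H"/><letter name="I"/>'
    '</letter>'
)

# Naive: remove children while looping over the same element.
naive = ET.fromstring(XML)
for child in naive:
    if child.get("name") != "G":
        naive.remove(child)
naive_names = [c.get("name") for c in naive]

# Safer: loop over a snapshot of the children taken before any removal.
safe = ET.fromstring(XML)
for child in list(safe):
    if child.get("name") != "G":
        safe.remove(child)
safe_names = [c.get("name") for c in safe]

print(naive_names)  # may still contain siblings that were skipped
print(safe_names)   # ['G']
```

If the first list still holds names other than G, the loop skipped children because the element shrank underneath it.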

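When you use the latter function, remove_siblings_of, on your xml, you will get the result you are looking for:

>>> import xml.etree.ElementTree as et
>>> root = et.fromstring(data)  # data is the XML string from the question
>>> siblings_removed = remove_siblings_of('name', 'G', root)
>>> print(et.tostring(siblings_removed, encoding="unicode"))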
<xml>
    <letter name="A">
            <letter name="B">
                    <letter name="G">
                    </letter>
                    </letter>
            <letter name="E">
                <letter name="F">
                </letter>
            </letter>
    </letter>
</xml>
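
For anyone who wants to run this end to end on Python 3, here is a self-contained sketch of the same idea. It is one possible arrangement, not the only one. The names remove_siblings and DATA are hypothetical, and DATA is a shortened form of the question's XML that uses self-closing tags. It still relies only on iteration, with no xpath or find for the removal itself.

```python
import xml.etree.ElementTree as ET

# Assumed sample input: the question's tree, written with self-closing tags.
DATA = """<xml>
    <letter name="A">
        <letter name="B">
            <letter name="C"/>
            <letter name="D"/>
            <letter name="G"/>
            <letter name="H"/>
            <letter name="I"/>
        </letter>
        <letter name="E">
            <letter name="F"/>
        </letter>
    </letter>
</xml>"""


def remove_siblings(root, attr, value):
    """Remove every sibling of each element whose `attr` equals `value`."""
    # Snapshot all elements first so the tree is not walked while it changes.
    for parent in list(root.iter()):
        children = list(parent)
        if any(child.get(attr) == value for child in children):
            for child in children:
                if child.get(attr) != value:
                    parent.remove(child)
    return root


root = remove_siblings(ET.fromstring(DATA), "name", "G")

# List the names of the remaining <letter> elements in document order.
names = [el.get("name") for el in root.iter("letter")]
print(names)  # ['A', 'B', 'G', 'E', 'F']
```

Taking both snapshots with list() before removing anything keeps the loop from skipping elements, at the cost of one extra pass over the tree.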