Remove a node from an etree but keep its children

Date: 2014-05-06 15:07:27

Tags: python xml elementtree

I'm traversing an XML tree and having trouble removing a node from the tree while leaving its inner nodes in place.

For example:

<xml>
    <letter name="B">
        <letter name="D">
            <letter name="E">
                <letter name="F">
                    <letter name="G">

                    </letter>
                </letter>
            </letter>
        </letter>
    </letter>
</xml>

I need something like this:

<xml>
    <letter name="B">
        <letter name="D">
            <letter name="F">
                <letter name="G">

                </letter>
            </letter>
        </letter>
    </letter>
</xml>

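But I can't figure out how to keep all of E's children.

Cheers!

3 answers:

Answer 0 (score: 3)

The idea is to find the letter element with name="E", get its parent, remove the element, and extend the parent with the element's children: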

import xml.etree.ElementTree as etree

data = """
<xml>
    <letter name="B">
        <letter name="D">
            <letter name="E">
                <letter name="F">
                    <letter name="G">

                    </letter>
                </letter>
            </letter>
        </letter>
    </letter>
</xml>
"""

XPATH = './/letter[@name="E"]'

tree = etree.fromstring(data)
letter = tree.find(XPATH)
parent = tree.find(XPATH + '/..')

parent.remove(letter)
parent.extend(letter)

print(etree.tostring(tree, encoding="unicode"))

Prints:

<xml>
    <letter name="B">
        <letter name="D">
            <letter name="F">
                    <letter name="G">

                    </letter>
                </letter>
            </letter>
    </letter>
</xml>
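
One thing to be aware of with the remove/extend approach is that extend() appends the children at the end of the parent, so they may land after any siblings of the removed element. A minimal sketch of a position-preserving variant follows; the helper name unwrap and the shortened sample data are hypothetical, not part of the answer above. Any tail text on the removed element is dropped along with it.

```python
import xml.etree.ElementTree as etree


def unwrap(parent, element):
    """Replace `element` with its own children, keeping their position in `parent`."""
    index = list(parent).index(element)   # where the element currently sits
    children = list(element)              # snapshot of its children
    parent.remove(element)
    for offset, child in enumerate(children):
        parent.insert(index + offset, child)


# Hypothetical sample: "E" has a following sibling "H".
data = ('<xml><letter name="D">'
        '<letter name="E"><letter name="F"/></letter>'
        '<letter name="H"/>'
        '</letter></xml>')

tree = etree.fromstring(data)
element = tree.find('.//letter[@name="E"]')
parent = tree.find('.//letter[@name="E"]/..')
unwrap(parent, element)

names = [child.get("name") for child in parent]
print(names)  # ['F', 'H']
```

With parent.extend(element) the same input would end up as H followed by F, which may or may not matter for your data.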

UPD (using an iterative approach):

def iterparent(tree):
    for parent in tree.iter():
        # Iterate over a snapshot so removing a child mid-loop is safe.
        for child in list(parent):
            yield parent, child

tree = etree.fromstring(data)
for parent, child in iterparent(tree):
    if child.tag == "letter" and child.attrib.get('name') == "E":
        parent.remove(child)
        parent.extend(child)

print(etree.tostring(tree, encoding="unicode"))

The iterparent() function is taken from the Accessing Parents section of the documentation.
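
If several elements can match, it may be easier to build a child-to-parent map first and only then modify the tree. The following is a rough sketch under that assumption; the function name remove_keep_children and the sample data are made up for illustration. Matches are processed in reverse document order so that a nested match is handled before its ancestor.

```python
import xml.etree.ElementTree as etree


def remove_keep_children(root, name):
    """Remove every <letter name=...> below root, re-attaching its children
    to its parent. Returns the number of elements removed."""
    # Build the child -> parent map before touching the tree.
    parent_map = {child: parent for parent in root.iter() for child in parent}
    targets = [el for el in root.iter("letter") if el.get("name") == name]
    removed = 0
    # Reverse document order: descendants are handled before their ancestors.
    for el in reversed(targets):
        parent = parent_map.get(el)
        if parent is None:      # the root itself has no parent, so skip it
            continue
        parent.remove(el)
        parent.extend(el)
        removed += 1
    return removed


# Hypothetical sample with two "E" elements.
data = ('<xml><letter name="B">'
        '<letter name="E"><letter name="F"/></letter>'
        '<letter name="E"><letter name="G"/></letter>'
        '</letter></xml>')

root = etree.fromstring(data)
removed = remove_keep_children(root, "E")
remaining = [el.get("name") for el in root.iter("letter")]
print(removed, sorted(remaining))
```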

Answer 1 (score: 0)

One more thing,

Is it possible to do something like this?

Initial XML

<xml>
    <letter name="B">
        <letter name="D">
            <letter name="E">
                <letter name="F">
                    <letter name="G">

                    </letter>
                </letter>
            </letter>
            <letter name="H">
                <letter name="I">

                </letter>
            </letter>
        </letter>
    </letter>
</xml>

And then output a list containing two trees, like this:

<xml>
    <letter name="B">
        <letter name="E">
            <letter name="F">
                <letter name="G">

                </letter>
            </letter>
        </letter>
    </letter>
</xml>


<xml>
    <letter name="B">
        <letter name="H">
            <letter name="I">

            </letter>
        </letter>
    </letter>
</xml>

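As you can see, @falsetru and @alecxe, I just removed D, leaving each tree with only one child.

Thanks!!!!

Answer 2 (score: 0)

I just got it working. I only needed to copy the tree before removing the node, otherwise the original object would be modified.

Here is the solution. By the way, thank you very much!!!! XD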

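import copy
import xml.etree.ElementTree as etree


def remove_letter(tree_original, letter):
    # Work on a deep copy so the original tree is not modified.
    tree = copy.deepcopy(tree_original)
    for parent in tree.iter():
        for child in list(parent):
            if child.attrib.get('name') == letter:
                # Detach the child, then re-attach its children to the parent.
                parent.remove(child)
                parent.extend(child)
                print(etree.tostring(parent, encoding="unicode"))
                return parent


def get_next_trees(tree):
    my_trees = []
    for parent in tree.iter():
        if parent.attrib.get('name') == "D":
            for child in parent:
                # Assumption: each child of "D" is removed in turn, by name.
                my_trees.append(remove_letter(tree, child.attrib.get('name')))
            return my_trees
    return my_trees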
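
For the two-tree output asked about earlier in the thread (D removed, one child kept per tree), something along these lines might work. This is only a sketch of one possible reading of that request; split_on and find_with_parent are hypothetical names, and each result is a deep copy so the input tree stays untouched.

```python
import copy
import xml.etree.ElementTree as etree


def find_with_parent(root, name):
    """Return (parent, element) for the first element whose name attribute matches."""
    for parent in root.iter():
        for child in parent:
            if child.get("name") == name:
                return parent, child
    return None, None


def split_on(root, name):
    """Return one copied tree per child of the element called `name`.
    In each copy, that element is replaced by exactly one of its children."""
    _, target = find_with_parent(root, name)
    if target is None:
        return []
    trees = []
    for keep_index in range(len(target)):
        clone = copy.deepcopy(root)               # leave the input unmodified
        parent, node = find_with_parent(clone, name)
        kept = node[keep_index]
        position = list(parent).index(node)
        parent.remove(node)
        parent.insert(position, kept)             # only this child survives
        trees.append(clone)
    return trees


data = ('<xml><letter name="B"><letter name="D">'
        '<letter name="E"><letter name="F"><letter name="G"/></letter></letter>'
        '<letter name="H"><letter name="I"/></letter>'
        '</letter></letter></xml>')

original = etree.fromstring(data)
trees = split_on(original, "D")
for t in trees:
    print([el.get("name") for el in t.iter("letter")])
# ['B', 'E', 'F', 'G']
# ['B', 'H', 'I']
```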