Question

Here is the sample data.

In input.xml:

<root>
    <entry id="1">
    <headword>go</headword>
    <example>I <hw>go</hw> to school.</example>
</entry>
</root>

I want to put the <example> node and its descendants inside a new <examplegrp> element. That is, Output.xml should be:

<root>
    <entry id="1">
    <headword>go</headword>
            <examplegrp>
                <example>I <hw>go</hw> to school.</example>
            </examplegrp>
</entry>
</root>

My poor and incomplete script is:

import codecs
import xml.etree.ElementTree as ET

fin = codecs.open(r'input.xml', 'rb', encoding='utf-8')

data = ET.parse(fin)
root = data.getroot()

example = root.find('.//example')
for elem in example.iter():
    pass  # ---and then I don't know what to do---

Answer 1

http://docs.python.org/3/library/xml.dom.html?highlight=xml#node-objects http://docs.python.org/3/library/xml.dom.html?highlight=xml#document-objects

You may want to follow the pattern of creating a document element and appending the found results to it.

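from xml.dom import minidom
document = minidom.parse('input.xml')
founds = document.getElementsByTagName('example')
group = document.createElement('examplegrp')
for found in list(founds):
    group.appendChild(found)  # the DOM method is appendChild; it moves the node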

or something similar.
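To make the idea above concrete, here is a minimal sketch using xml.dom.minidom from the standard library. The helper name wrap_examples and the SAMPLE string are made up for illustration, and it assumes all <example> elements that share a parent should end up in one <examplegrp>, placed where the first <example> used to be.

```python
from xml.dom import minidom

# Sample data copied from the question.
SAMPLE = """<root>
    <entry id="1">
    <headword>go</headword>
    <example>I <hw>go</hw> to school.</example>
</entry>
</root>"""


def wrap_examples(document, tag="example", group_tag="examplegrp"):
    """Hypothetical helper: wrap the `tag` elements of each parent in one `group_tag`."""
    groups = []  # (parent, group) pairs created so far
    for found in list(document.getElementsByTagName(tag)):
        parent = found.parentNode
        group = next((g for p, g in groups if p is parent), None)
        if group is None:
            # Create the wrapper and put it where the first match currently sits.
            group = document.createElement(group_tag)
            parent.insertBefore(group, found)
            groups.append((parent, group))
        # appendChild detaches the node from its old parent before adding it.
        group.appendChild(found)
    return document


document = wrap_examples(minidom.parseString(SAMPLE))
print(document.documentElement.toxml())
```

The descendants of <example> (here <hw>) travel with it, since moving a DOM node moves its whole subtree.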

Answer 2

Here is an example of how it can be done:

text = """
<root>
    <entry id="1">
        <headword>go</headword>
        <example>I <hw>go</hw> to school.</example>
    </entry>
</root>
"""

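import io
import lxml.etree

# Parse the sample document from the string above.
data = lxml.etree.parse(io.StringIO(text))
root = data.getroot()

# For every <entry> that contains an <example>, add an <examplegrp>
# child and move the entry's direct <example> children into it.
for entry in root.xpath('//example/ancestor::entry[1]'):
    examplegrp = lxml.etree.SubElement(entry, "examplegrp")
    for node in entry.xpath('./example'):
        entry.remove(node)
        examplegrp.append(node)

# tostring() returns bytes by default; encoding='unicode' yields a str.
print(lxml.etree.tostring(root, pretty_print=True, encoding='unicode'))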

This will output:

<root>
    <entry id="1">
        <headword>go</headword>
        <examplegrp><example>I <hw>go</hw> to school.</example>
    </examplegrp></entry>
</root>
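Since the question uses the standard library's xml.etree.ElementTree rather than lxml, roughly the same regrouping can be sketched without XPath ancestor axes. This is only a sketch: the function name group_examples and the SAMPLE string are invented for illustration, and it assumes the wrapper should take the position of the first <example> in each parent. ElementTree elements do not keep a reference to their parent, so the loop visits every element and looks at its direct children instead.

```python
import xml.etree.ElementTree as ET

# Sample data copied from the question.
SAMPLE = """<root>
    <entry id="1">
    <headword>go</headword>
    <example>I <hw>go</hw> to school.</example>
</entry>
</root>"""


def group_examples(root, tag="example", group_tag="examplegrp"):
    """Hypothetical helper: move each parent's direct `tag` children into a new `group_tag`."""
    # Snapshot the elements first, because the tree is modified inside the loop.
    for parent in list(root.iter()):
        children = parent.findall(tag)  # direct children only
        if not children:
            continue
        # Remember where the first match sits so the wrapper can replace it.
        index = list(parent).index(children[0])
        group = ET.Element(group_tag)
        for child in children:
            parent.remove(child)  # the subtree (e.g. <hw>) moves with the child
            group.append(child)
        parent.insert(index, group)
    return root


root = group_examples(ET.fromstring(SAMPLE))
print(ET.tostring(root, encoding="unicode"))
```

As for selecting descendants, element.iter() yields the element itself followed by all of its descendants in document order, and element.iter('hw') restricts that to a given tag.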

How do I select all descendants of an element using ElementTree in Python 3.3?
