Question

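I've done some research on this question, but I haven't been able to come up with anything useful. What I need is not just to parse and read XML, but to actually manipulate XML documents in Python, similar to the way JavaScript can manipulate HTML documents.

Allow me to give an example. Say I have the following XML document: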

<library>
    <book id="123">
        <title>Intro to XML</title>
        <author>John Smith</author>
        <year>1996</year>
    </book>
    <book id="456">
        <title>XML 101</title>
        <author>Bill Jones</author>
        <year>2000</year>
    </book>
    <book id="789">
        <title>This Book is Unrelated to XML</title>
        <author>Justin Tyme</author>
        <year>2006</year>
    </book>
</library>

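I need a way to retrieve an element, either with XPath or with a "pythonic" method as outlined here, but I also need to be able to manipulate the document, like this: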

>>> xml.getElement('id=123').title = "Intro to XML v2"
>>> xml.getElement('id=123').year = "1998"

If anyone knows of such a tool in Python, please let me know. Thanks!

Answer 1

If you want to avoid installing lxml.etree, you can use xml.etree from the standard library instead.

Here is Acorn's answer ported to xml.etree:

import xml.etree.ElementTree as et  # was: import lxml.etree as et

xmltext = """
<root>
    <fruit>apple</fruit>
    <fruit>pear</fruit>
    <fruit>mango</fruit>
    <fruit>kiwi</fruit>
</root>
"""

tree = et.fromstring(xmltext)

for fruit in tree.findall('fruit'): # was: tree.xpath('//fruit')
    fruit.text = 'rotten %s' % (fruit.text,)

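# encoding='unicode' makes tostring() return str rather than bytes on Python 3
print(et.tostring(tree, encoding='unicode'))  # removed argument: pretty_print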

Note: I would have posted this as a comment on Acorn's answer if I could have done so in a clear way. If you like this answer, please upvote Acorn's.
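To connect this back to the question's own example, here is a minimal sketch that applies the same standard-library approach to a shortened copy of the library document. It assumes the id attributes are quoted (as well-formed XML requires) and relies on the limited XPath subset that xml.etree.ElementTree supports, which includes [@attr='value'] predicates. The variable names are only illustrative.

```python
import xml.etree.ElementTree as et

# Shortened copy of the question's document, with quoted attribute values.
xmltext = """
<library>
    <book id="123">
        <title>Intro to XML</title>
        <author>John Smith</author>
        <year>1996</year>
    </book>
    <book id="456">
        <title>XML 101</title>
        <author>Bill Jones</author>
        <year>2000</year>
    </book>
</library>
"""

library = et.fromstring(xmltext)

# Select the book by its id attribute using an attribute predicate.
book = library.find("./book[@id='123']")

# Elements are mutable: assigning to .text edits the document in place.
book.find("title").text = "Intro to XML v2"
book.find("year").text = "1998"

# Serialize the modified tree back to a string.
print(et.tostring(library, encoding="unicode"))
```

This is not quite the attribute-style syntax asked for in the question (book.title = ...), but find() plus .text gives the same effect with no extra dependencies.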

Answer 2

lxml lets you select elements using XPath, and it also lets you manipulate those elements.

import lxml.etree as et

xmltext = """
<root>
    <fruit>apple</fruit>
    <fruit>pear</fruit>
    <fruit>mango</fruit>
    <fruit>kiwi</fruit>
</root>
"""

tree = et.fromstring(xmltext)

for fruit in tree.xpath('//fruit'):
    fruit.text = 'rotten %s' % (fruit.text,)

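# encoding='unicode' makes tostring() return str rather than bytes on Python 3
print(et.tostring(tree, pretty_print=True, encoding='unicode'))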

Result:

<root>
    <fruit>rotten apple</fruit>
    <fruit>rotten pear</fruit>
    <fruit>rotten mango</fruit>
    <fruit>rotten kiwi</fruit>
</root>

Is there an easy way to manipulate XML documents in Python?
