Question

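I'm using xml.etree.ElementTree as ET, which seems to be the go-to library, but if there is something else or better for the job, I'm interested.

Let's say I have a tree: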

doc = """
<top>
<second>
<third>
    <subthird></subthird>
    <subthird2>
         <subsubthird>findme</subsubthird>
    </subthird2>
</third>
</second>
</top>"""

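For the sake of this question, let's say this is already in an ElementTree named myTree.

I want to update findme to found. Is there a simple way to do that, other than stepping through the children like this?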

myTree.getroot().getchildren()[0].getchildren()[0].getchildren() \
    [1].getchildren()[0].text = 'found'

The issue is that I have a large XML tree and want to update values like this, and I can't find a clear, Pythonic way to do it.

Answer 1

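You can use XPath expressions to select elements by tag name. For example, myTree.findall('.//subsubthird') returns every subsubthird element at any depth, and you can then set .text on each match.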

If you need to find all tags with a specific text value, see this answer: Find element by text with XPath in ElementTree.
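
A minimal sketch of the tag-name lookup using only the standard library, assuming the document from the question is parsed from the doc string. The names root and result are illustrative, and ET.fromstring returns the root element directly rather than an ElementTree like myTree:

```python
import xml.etree.ElementTree as ET

doc = """
<top>
<second>
<third>
    <subthird></subthird>
    <subthird2>
         <subsubthird>findme</subsubthird>
    </subthird2>
</third>
</second>
</top>"""

root = ET.fromstring(doc)

# './/subsubthird' matches every <subsubthird> at any depth below root.
for elem in root.findall('.//subsubthird'):
    # Only touch elements whose text is the value we want to replace.
    if elem.text == 'findme':
        elem.text = 'found'

result = ET.tostring(root, encoding='unicode')
print(result)
```

With an ElementTree object such as myTree, the same findall call can be made on myTree itself or on myTree.getroot().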

Answer 2

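I use lxml with XPath expressions. ElementTree has an abbreviated XPath syntax, but since I don't use it, I don't know how extensive it is. The thing about XPath is that you can write element selectors as complex as you need. In this example, the selection is based on nesting: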

import lxml.etree 

doc = """
<top>
<second>
<third>
    <subthird></subthird>
    <subthird2>
         <subsubthird>findme</subsubthird>
    </subthird2>
</third>
</second>
</top>"""

root = lxml.etree.XML(doc)
for elem in root.xpath('second/third/subthird2/subsubthird'):
    elem.text = 'found'

print(lxml.etree.tostring(root, pretty_print=True, encoding='unicode'))

But suppose there is some other identifier, such as a unique attribute,

<subthird2 class="foo"><subsubthird>findme</subsubthird></subthird2>

then the XPath would be //subthird2[@class="foo"]/subsubthird.
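
For comparison, this particular attribute predicate also falls within the XPath subset that the standard library's ElementTree supports. The sketch below is an illustration rather than part of the original answer: the second subthird2 element with class="bar" is a hypothetical addition to show that only the matching branch changes, and ElementTree needs the path written relative to the element, so it starts with a dot:

```python
import xml.etree.ElementTree as ET

# Hypothetical document: two <subthird2> elements that differ only by class.
doc = """
<top>
  <subthird2 class="foo"><subsubthird>findme</subsubthird></subthird2>
  <subthird2 class="bar"><subsubthird>findme</subsubthird></subthird2>
</top>"""

root = ET.fromstring(doc)

# Select <subsubthird> children of any <subthird2> whose class is "foo".
for elem in root.findall('.//subthird2[@class="foo"]/subsubthird'):
    elem.text = 'found'

# Collect the text of every <subsubthird> in document order.
texts = [e.text for e in root.iter('subsubthird')]
print(texts)
```

More elaborate selectors (XPath functions, axes, and so on) are where lxml's full XPath support becomes necessary.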
