Question

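I have searched for this in every way I can think of, but with no luck. My apologies in advance if this has already been answered in a different form.

I need help with the following:

I have a list of name values, such as ['John Smith', 'New York', 'Toys'], which I know exist in an XML document like this: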

<doc>
    <people>
        <name>John Smith</name>
    </people>
    <places>
        <name>New York</name>
    </places>
    <things>
        <name>Toys</name>
    </things>
    <about>
       <name>John Smith is male.</name>
    </about>
</doc>

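Using ElementTree, I can loop through the list and find these values in the document.

What I am trying to figure out is how to do the following:

Loop through the list and find each value in the document
Identify the XML tags surrounding each value and return the tag names

I have not been able to work this out, but I think there must be a way to do it without too much heavy lifting. Any advice or suggestions would be appreciated.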

Answer 1

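If you are not tied to ElementTree, here is a simple example using lxml (note: I have not written the loop for you; you can do that part). It gives you the tag containing the text, then the tag of that element's parent: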

#!/usr/bin/env python3

from lxml import etree

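# Parse the file directly so lxml handles any encoding declaration itself.
doc = etree.parse('ex.xml')

# Select every <name> element whose text is exactly 'John Smith'.
elems = doc.xpath("//name[text()='John Smith']")

for e in elems:
    parent = e.getparent()
    # e.tag is the element holding the text; parent.tag is the element enclosing it.
    print(e.tag, parent.tag)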
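
For anyone who prefers to stay with the standard library's ElementTree, which has no getparent(), the loop over the list could be sketched roughly as follows. This is only an illustration under some assumptions: the helper name find_enclosing_tags is made up for this example, the sample document is inlined as a string instead of being read from ex.xml, and a value counts as found only when it equals an element's text exactly (after stripping whitespace).

```python
import xml.etree.ElementTree as ET

# Sample document from the question, inlined for the sake of the example.
XML = """<doc>
    <people>
        <name>John Smith</name>
    </people>
    <places>
        <name>New York</name>
    </places>
    <things>
        <name>Toys</name>
    </things>
    <about>
       <name>John Smith is male.</name>
    </about>
</doc>"""


def find_enclosing_tags(root, values):
    """Hypothetical helper: map each value to (element tag, parent tag) pairs."""
    # ElementTree elements do not know their parent, so build a
    # child -> parent lookup once up front.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    wanted = set(values)
    found = {value: [] for value in values}
    for elem in root.iter():
        text = (elem.text or "").strip()
        if text in wanted:  # exact match only, so "John Smith is male." is skipped
            parent = parent_of.get(elem)
            found[text].append((elem.tag, parent.tag if parent is not None else None))
    return found


root = ET.fromstring(XML)
result = find_enclosing_tags(root, ["John Smith", "New York", "Toys"])
for value, hits in result.items():
    print(value, hits)
```

Walking the tree once and checking each element's text against a set avoids running a separate search per value. If substring matches are wanted instead (so that "John Smith is male." also counts), the exact comparison would need to be replaced with a containment check.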

Answer 2

Use re.findall:

re.findall(r'<name>(.*?)</name>', string)
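
As a rough illustration of what this returns on the sample document, and of one way it might be stretched to also capture the enclosing tag, here is a small sketch. The second pattern is an assumption of mine, not part of the answer: it only works when each name element is the sole child of a parent tag with no attributes, so it is fragile compared with a real XML parser.

```python
import re

# Sample document from the question, inlined for the sake of the example.
xml_text = """<doc>
    <people>
        <name>John Smith</name>
    </people>
    <places>
        <name>New York</name>
    </places>
    <things>
        <name>Toys</name>
    </things>
    <about>
       <name>John Smith is male.</name>
    </about>
</doc>"""

# The non-greedy (.*?) stops at the first closing tag, which matters
# if two name elements ever share a line.
names = re.findall(r"<name>(.*?)</name>", xml_text)
print(names)

# Assumed extension: capture the parent tag as well, using a backreference
# to require the matching closing tag right after the name element.
pairs = re.findall(r"<(\w+)>\s*<name>(.*?)</name>\s*</\1>", xml_text)

wanted = {"John Smith", "New York", "Toys"}
parents = {text: parent for parent, text in pairs if text in wanted}
print(parents)
```

The first call yields only the text values, so on its own it does not say which element each value lives in; the second pattern recovers that for this particular layout only.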

Using Python to determine which element a value is in
