Python lxml and parsing a subtree

Time: 2013-10-08 14:53:11

Tags: python lxml

I have an XML document that looks like this:

<root>
<foo>
<a></a>
<b></b>
<c></c>
</foo>
<bars>
<bar>
<one>interesting</one>
<two>interesting</two>
<three>interesting</three>
</bar>
<bar>
<one>interesting</one>
<two>interesting</two>
<three>interesting</three>
</bar>
<bar>
<one>interesting</one>
<two>interesting</two>
<three>interesting</three>
</bar>
</bars>
</root>

I want to extract the interesting text from all the bar elements. Can you tell me how to get started? I tried using

bars = etree.iterparse(xml_data, tag="bars")

but I can't iterate over the result.

1 answer:

Answer 0 (score: 0)

Use the findall method, which returns a list of all matching elements.

# Bytes literal: on Python 3, lxml rejects a str that carries an encoding declaration.
xml_data = b'''<?xml version='1.0' encoding='ASCII' ?>
<root>
<foo>
<a></a>
<b></b>
<c></c>
</foo>
<bars>
<bar>
<one>interesting</one>
<two>interesting</two>
<three>interesting</three>
</bar>
<bar>
<one>interesting</one>
<two>interesting</two>
<three>interesting</three>
</bar>
<bar>
<one>interesting</one>
<two>interesting</two>
<three>interesting</three>
</bar>
</bars>
</root>
'''

from lxml import etree

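# Parse the document, then find every <bars> element at any depth.
root = etree.fromstring(xml_data)
for bars in root.findall('.//bars'):
    # method='text' drops the tags and keeps only the text content;
    # encoding='unicode' returns a str rather than bytes.
    print(etree.tostring(bars, method='text', encoding='unicode'))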
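
If you need the text of each child separately rather than one concatenated string, something along these lines may help. It is only a sketch under a few assumptions: the sample is shortened to two bar elements, the function names bar_texts_findall and bar_texts_iterparse are made up for illustration, and every child of a bar is assumed to be a plain element. The second function also shows one way the iterparse attempt from the question could work. iterparse expects a filename or a file-like object rather than the XML text itself, and it yields (event, element) tuples, which is probably why iterating over it did not behave as expected.

```python
from io import BytesIO

from lxml import etree

# Shortened copy of the sample document (two <bar> elements instead of three).
xml_data = b"""<root>
<foo><a></a></foo>
<bars>
<bar><one>interesting</one><two>interesting</two><three>interesting</three></bar>
<bar><one>interesting</one><two>interesting</two><three>interesting</three></bar>
</bars>
</root>"""


def bar_texts_findall(data):
    """Return one {tag: text} dict per <bar>, using findall on the parsed tree."""
    root = etree.fromstring(data)
    return [{child.tag: child.text for child in bar}
            for bar in root.findall('.//bar')]


def bar_texts_iterparse(data):
    """Same result, but streaming: iterparse yields (event, element) tuples."""
    results = []
    # iterparse wants a filename or file-like object, so wrap the bytes.
    for _event, bar in etree.iterparse(BytesIO(data), events=('end',), tag='bar'):
        results.append({child.tag: child.text for child in bar})
        bar.clear()  # release the subtree once its text has been copied
    return results


print(bar_texts_findall(xml_data))
print(bar_texts_iterparse(xml_data))
```

For a document this small, findall is the simpler choice. iterparse mainly pays off when the file is too large to hold in memory comfortably.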