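Given the following XML structure, and using ElementTree, I am trying to extract the description text of every item whose title text contains a certain keyword of interest. Thanks for any suggestions.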
<data>
<item>
<title>contains KEYWORD of interest </title>
<description> description text of interest "1"</description>
</item>
<item>
<title>title text </title>
<description> description text not of interest</description>
</item>
.
.
.
<item>
<title>also contains KEYWORD of interest </title>
<description> description text of interest "k" </description>
</item>
</data>
Expected result:
description text of interest "1"
description text of interest "k"
Answer 0 (score: 1)
xml = '''<data>
<item>
<title>contains KEYWORD of interest </title>
<description> description text of interest "1"</description>
</item>
<item>
<title>title text </title>
<description> description text not of interest</description>
</item>
.
.
.
<item>
<title>also contains KEYWORD of interest </title>
<description> description text of interest "k" </description>
</item>
</data>
'''
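# Option 1: lxml (third-party), which supports full XPath 1.0
import lxml.etree
root = lxml.etree.fromstring(xml)
# Match each <title> containing "KEYWORD", then take the text of its sibling <description>
root.xpath('.//title[contains(text(), "KEYWORD")]/'
           'following-sibling::description/text()')
# => [' description text of interest "1"', ' description text of interest "k" ']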
# Option 2: the standard library's ElementTree, filtering the items in Python
import xml.etree.ElementTree as ET
root = ET.fromstring(xml)
[item.find('description').text for item in root.iter('item')
 if 'KEYWORD' in item.find('title').text]
# => [' description text of interest "1"', ' description text of interest "k" ']
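If some items might lack a title or description, the ElementTree list comprehension would raise an AttributeError on the missing child. Below is a minimal sketch of a more defensive variant that also strips the surrounding whitespace so the output matches the expected result exactly. The function name descriptions_for_keyword and the sample data (including the item without a title) are my own additions for illustration, not part of the question.

```python
import xml.etree.ElementTree as ET


def descriptions_for_keyword(xml_text, keyword):
    """Return stripped <description> texts of items whose <title> contains keyword.

    Hypothetical helper for illustration; the name is not from the question.
    """
    root = ET.fromstring(xml_text)
    results = []
    for item in root.iter('item'):
        # findtext returns the default when the child element is missing;
        # the "or ''" is an extra guard in case no text comes back
        title = item.findtext('title', default='') or ''
        if keyword in title:
            description = item.findtext('description', default='') or ''
            results.append(description.strip())
    return results


# Sample data modeled on the question, plus one item with no <title>
sample = '''<data>
<item>
<title>contains KEYWORD of interest </title>
<description> description text of interest "1"</description>
</item>
<item>
<title>title text </title>
<description> description text not of interest</description>
</item>
<item>
<description> item without a title </description>
</item>
<item>
<title>also contains KEYWORD of interest </title>
<description> description text of interest "k" </description>
</item>
</data>'''

print(descriptions_for_keyword(sample, 'KEYWORD'))
# => ['description text of interest "1"', 'description text of interest "k"']
```

The match is a case-sensitive substring test, the same as in the one-liners above; lower-casing both sides would be one way to relax that if needed.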