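I am trying to parse the following text out of an XML document:
title_text = word1 Word2 word3 word4
The problem is that with the code below I only get title_text = 'word1'.
How can I get the full title text, including the words inside the child elements?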
XML:
<response>...<results>...<grouping>...<group>...
<doc>...
<title>
word1
<hlword>Word2</hlword>
<hlword>word3</hlword>
word4
</title>
...
</doc>
</group>...</grouping>...</results>...</response>...
Parsing code:
from lxml import objectify
...
tree = objectify.fromstring(xml)
nodes = tree.response.results.grouping.group
for node in nodes:
    title_element = node.doc.title
    title_text = title_element.text
    print(title_text)
Answer 0 (score: 1)
Just iterate over .itertext():
>>> for node in nodes:
...     print(' '.join(node.doc.title.itertext()))
...
word1 Word2 word3 word4
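To see why .text stops at 'word1', here is a minimal sketch. It assumes a simplified inline version of the <title> element from the question, and it uses the standard library's xml.etree.ElementTree as a stand-in for lxml (both expose .text, .tail and itertext() on elements). The whitespace normalisation at the end is my own addition, not part of the answer above.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical version of the <doc> element from the question.
xml = (
    "<doc><title>word1 <hlword>Word2</hlword> "
    "<hlword>word3</hlword> word4</title></doc>"
)

title = ET.fromstring(xml).find("title")

# .text only holds the text that comes before the first child element.
first_chunk = title.text

# Text that follows a child element is stored on that child's .tail.
tails = [child.tail for child in title]

# itertext() yields every text and tail fragment in document order.
pieces = list(title.itertext())

# Concatenate the fragments, then collapse any runs of whitespace.
title_text = " ".join("".join(pieces).split())
print(title_text)
```

Splitting and re-joining avoids doubled spaces or stray newlines when the fragments already carry their own whitespace, which is likely if the real <title> is spread over several lines as shown in the question.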