I have the following .xml file that I would like to manipulate:
<html>
<A>
<B>
<C>
<D>
<TYPE>
<NUMBER>7297</NUMBER>
<DATA />
</TYPE>
<TYPE>
<NUMBER>7721</NUMBER>
<DATA>A=1,B=2,C=3,</DATA>
</TYPE>
</D>
</C>
</B>
</A>
</html>
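I want to change the text inside the <DATA> element that sits in the same <TYPE> as <NUMBER>7721</NUMBER>. How do I do that? If I use find() or findtext(), I can only reach the first match.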
Answer 0 (score: 3)
XPath is well suited to this kind of thing. //TYPE[NUMBER='7721' and DATA] finds every TYPE node that has at least one NUMBER child whose text is "7721" and at least one DATA child:
from lxml import etree
xmlstr = """<html>
<A>
<B>
<C>
<D>
<TYPE>
<NUMBER>7297</NUMBER>
<DATA />
</TYPE>
<TYPE>
<NUMBER>7721</NUMBER>
<DATA>A=1,B=2,C=3,</DATA>
</TYPE>
</D>
</C>
</B>
</A>
</html>"""
html_element = etree.fromstring(xmlstr)
# find all TYPE nodes that have a NUMBER child with text '7721' and a DATA child
type_nodes = html_element.xpath("//TYPE[NUMBER='7721' and DATA]")
# the loop is probably superfluous, but there might be more than one match
for t in type_nodes:
    d = t.find('DATA')
    # example: append spamandeggs to the end of the data text
    if d.text is None:
        d.text = 'spamandeggs'
    else:
        d.text += 'spamandeggs'
# encoding="unicode" makes tostring() return str instead of bytes on Python 3
print(etree.tostring(html_element, encoding="unicode"))
Output:
<html>
<A>
<B>
<C>
<D>
<TYPE>
<NUMBER>7297</NUMBER>
<DATA/>
</TYPE>
<TYPE>
<NUMBER>7721</NUMBER>
<DATA>A=1,B=2,C=3,spamandeggs</DATA>
</TYPE>
</D>
</C>
</B>
</A>
</html>
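If lxml is not available, roughly the same thing can be done with the standard library, since findall() returns every match rather than only the first one. The following is a minimal sketch, assuming the same document shape as above; the helper name append_to_data is hypothetical and is not part of the original answer. It relies on the [tag='text'] predicate from the limited XPath subset that xml.etree.ElementTree supports.

```python
import xml.etree.ElementTree as ET

# Same structure as the question's file, condensed for brevity.
XML = """<html><A><B><C><D>
<TYPE><NUMBER>7297</NUMBER><DATA /></TYPE>
<TYPE><NUMBER>7721</NUMBER><DATA>A=1,B=2,C=3,</DATA></TYPE>
</D></C></B></A></html>"""


def append_to_data(root, number, suffix):
    """Append suffix to the DATA text of every TYPE whose NUMBER equals number.

    Hypothetical helper for illustration; returns how many DATA elements changed.
    """
    changed = 0
    # findall() yields all matching TYPE elements, not just the first one.
    for type_node in root.findall(f".//TYPE[NUMBER='{number}']"):
        for data in type_node.findall("DATA"):
            # An empty element such as <DATA /> has text None.
            data.text = (data.text or "") + suffix
            changed += 1
    return changed


root = ET.fromstring(XML)
count = append_to_data(root, "7721", "spamandeggs")
print(count)
print(ET.tostring(root, encoding="unicode"))
```

Unlike the lxml version, ElementTree's XPath subset has no "and" operator, so the check for a DATA child is done by the inner findall() instead of inside the predicate.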