etree.strip_tags returns None when trying to strip a tag

Time: 2014-02-06 17:43:38

Tags: python lxml

Script:

print entryDetails

for i in range(len(entryDetails)):
    print etree.tostring(entryDetails[i])

    print etree.strip_tags(entryDetails[i], 'entry-details')

Output:

[<Element entry-details at 0x234e0a8>, <Element entry-details at 0x234e878>]
<entry-details>2014-02-05 11:57:01</entry-details>
None
<entry-details>2014-02-05 12:11:05</entry-details>
None

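Why does etree.strip_tags fail to remove the entry-details tag? Does the dash in the tag name affect it?

1 answer:

Answer 0 (score: 1)

strip_tags() does not return anything, which is why print shows None. It strips the tags in place, modifying the tree you pass in. The dash in the tag name is not the problem.

The documentation says: "Note that this will not delete the element (or ElementTree root element) that you passed even if it matches. It will only treat its descendants."

Demo code: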

from lxml import etree

XML = """
<root>
 <entry-details>ABC</entry-details>
</root>"""

root = etree.fromstring(XML)
ed = root.xpath("//entry-details")[0]
print ed
print

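etree.strip_tags(ed, "entry-details")       # Has no effect: ed itself is never stripped, only its descendants
print etree.tostring(root)
print

etree.strip_tags(root, "entry-details")     # Strips matching descendants of root in place; returns None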
print etree.tostring(root)

Output:

<Element entry-details at 0x2123b98>

<root>
 <entry-details>ABC</entry-details>
</root>

<root>
 ABC
</root>
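
Applied to the loop in the question, the same idea might look like the sketch below. It assumes Python 3 and a hypothetical `<root>` wrapper around the two entries, since the question does not show the parent element. If only the text is needed, reading `.text` avoids stripping altogether; otherwise `strip_tags()` is called on an ancestor and the tree is serialized afterwards, rather than printing the call's return value.

```python
from lxml import etree

# Hypothetical document: the <root> wrapper is an assumption,
# the two entries are taken from the question's output.
XML = """
<root>
 <entry-details>2014-02-05 11:57:01</entry-details>
 <entry-details>2014-02-05 12:11:05</entry-details>
</root>"""

root = etree.fromstring(XML)
entries = root.xpath("//entry-details")

# Option 1: if only the text is wanted, read it directly.
texts = [entry.text for entry in entries]
print(texts)

# Option 2: strip the tags via an ancestor. The call mutates the tree
# and returns None, so inspect the tree afterwards, not the result.
result = etree.strip_tags(root, "entry-details")
print(result)

# encoding="unicode" makes tostring() return str instead of bytes.
serialized = etree.tostring(root, encoding="unicode")
print(serialized)
```

After the second option runs, `root` has no child elements left and the two timestamps have been merged into its text.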