Question

Using lxml, I want to take an HTML element and convert it to a string without its child elements. How can I do that?


Answer 1

You can remove the children with the remove method:

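import lxml.html as LH

code = '''<a foo="bar">some text<b></b> here <c><d>Hi</d></c> and here</a>'''

root = LH.fromstring(code)
print(root.text_content())
# some text here Hi and here

# Iterate over a copy of the child list, since we modify root inside the loop
for elt in list(root):
    root.remove(elt)

# encoding='unicode' returns a str rather than bytes
print(LH.tostring(root, encoding='unicode'))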

yields

<a foo="bar">some text</a>

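Note, however, that not all of the text returned by text_content is preserved once the children are removed: the tail text that follows each child (" here " and " and here") is dropped along with it.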
