I want to merge everything in two paragraphs into a single paragraph, with one space between them. How can I do this with lxml?
Example:
<p>He is <b>bold</b>!</p>
<p>Is he <u>here</u>?</p>
should be merged into:
<p>He is <b>bold</b>! Is he <u>here</u>?</p>
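Answer 0 (score: 0)
If your structure is simple, this might do the trick. Keep in mind that lxml stores the text before an element's first child in .text and the text after each child in that child's .tail: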
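from lxml import etree

root = etree.fromstring("<root></root>")
first = etree.fromstring("<p>He is <b>bold</b>!</p>")
second = etree.fromstring("<p>Is he <u>here</u>?</p>")

# The space and second's leading text go after first's existing content.
# .text and .tail can be None, so fall back to an empty string.
joined = ' ' + (second.text or '')
if len(first):
    first[-1].tail = (first[-1].tail or '') + joined
else:
    first.text = (first.text or '') + joined

# Move second's children into first (not root); each child's tail moves with it.
for child in list(second):
    first.append(child)

root.append(first)
print(etree.tostring(root, encoding='unicode'))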
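When the two paragraphs are siblings inside a larger document, the same idea can be wrapped in a small helper. The following is only a sketch under a few assumptions: the name merge_paragraphs and its sep parameter are made up for illustration, the paragraphs are assumed to be adjacent children of one parent, and the fallback import is there only so the snippet also runs where lxml is not installed.

```python
try:
    from lxml import etree
except ImportError:
    # Assumption: the standard library's ElementTree offers the same
    # text/tail/append API, so the sketch also runs without lxml.
    import xml.etree.ElementTree as etree


def merge_paragraphs(first, second, sep=" "):
    """Append the content of `second` to `first`, separated by `sep`."""
    lead = sep + (second.text or "")
    if len(first):
        # Text following the last child lives in that child's tail.
        last = first[-1]
        last.tail = (last.tail or "") + lead
    else:
        first.text = (first.text or "") + lead
    # Snapshot the children first: lxml moves an element when it is appended.
    for child in list(second):
        first.append(child)
    return first


doc = etree.fromstring(
    "<div><p>He is <b>bold</b>!</p><p>Is he <u>here</u>?</p></div>"
)
first, second = list(doc)
merge_paragraphs(first, second)
doc.remove(second)  # drop the now-redundant second paragraph
result = etree.tostring(doc, encoding="unicode")
print(result)
```

For the sample input this prints <div><p>He is <b>bold</b>! Is he <u>here</u>?</p></div>. Any whitespace or text that followed the second paragraph in the source (its tail) is dropped along with it, so copy it onto first.tail beforehand if it matters.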