How to get the text that follows a tag after finding the tag
Example:
#!/usr/bin/env python
import lxml.html
html = """
<b>Point1:</b> Text1 <br>
<b>Point2:</b> Text2 <br>
...
<b>PointN:</b> TextN
<b>PointN+1:</b> TextN+1<br>
"""
dom = lxml.html.document_fromstring(html)
el = dom.xpath('//b[text()="PointN:"]')
print(el)
I found the tag el by its text, PointN. How do I get the text TextN that comes after it?
Answer 0 (score: 3)
Since TextN follows the <b> you have already found, you can use the XPath following axis:
dom.xpath('//b[text() = "PointN:"]/following::node()')[0]
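The expression above returns the raw text node, including its surrounding whitespace, and `[0]` raises an IndexError if nothing matches. A minimal sketch of a slightly narrower variant, assuming the text sits directly next to the `<b>` as a sibling: it uses `following-sibling::text()[1]` to ask only for the first text node and strips the result. The shortened sample HTML and the name `text_after` are illustrative, not from the question.

```python
import lxml.html

# Shortened sample data modeled on the question (the "..." rows are left out).
html = """
<b>Point1:</b> Text1 <br>
<b>PointN:</b> TextN
<b>PointN+1:</b> TextN+1<br>
"""

dom = lxml.html.document_fromstring(html)

# following-sibling::text()[1] selects the first text node that comes
# after the matched <b> among its siblings.
nodes = dom.xpath('//b[text()="PointN:"]/following-sibling::text()[1]')

# Guard against "no match" instead of indexing blindly, then trim whitespace.
text_after = nodes[0].strip() if nodes else None
print(text_after)
```

If the text could be wrapped in another element rather than sitting right after the `<b>`, the broader `following` axis from the answer is probably the safer choice.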
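Answer 1 (score: 3)
Another way is to read the element's tail attribute, which holds the text immediately after its closing tag: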
el = dom.xpath('//b[text()="PointN:"]')[0]
print(el.tail)
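Note that `tail` is None when no text follows the element, and it keeps leading and trailing whitespace. A rough sketch of how this could be wrapped up, assuming the same layout as in the question; the helper name `text_after_label` and the shortened sample HTML are made up for illustration:

```python
import lxml.html

# Shortened sample data modeled on the question (the "..." rows are left out).
html = """
<b>Point1:</b> Text1 <br>
<b>PointN:</b> TextN
<b>PointN+1:</b> TextN+1<br>
"""


def text_after_label(dom, label):
    """Return the stripped tail text of the first <b> whose text equals label, or None."""
    # lxml accepts XPath variables as keyword arguments, which avoids quoting issues.
    matches = dom.xpath('//b[text()=$label]', label=label)
    if not matches:
        return None
    tail = matches[0].tail  # text between </b> and the next tag; may be None
    return tail.strip() if tail else None


dom = lxml.html.document_fromstring(html)
print(text_after_label(dom, "PointN:"))

# The same idea collects every label/text pair in one pass.
pairs = {b.text: (b.tail or "").strip() for b in dom.iter("b")}
print(pairs)
```

Compared with the XPath axis approach, this stays in plain Python once the element is found, which may be easier to read when you need the text for many labels.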