Question

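I am using Beautiful Soup.

Is there a way to get hold of a tag based on its position next to a comment (something not included in the parse tree)?

For example, let's say I have...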

<html>
<body>
<p>paragraph 1</p>
<p>paragraph 2</p>
<!--text-->
<p>paragraph 3</p>
</body>
</html>

In this example, how can I identify <p>paragraph 2</p>, given that I am searching for the comment <!--text-->?

Thanks for any help.

Answer 1

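Comments appear in the BeautifulSoup parse tree like any other node. For example, to find the comment whose text is "some comment text" and then print the <p> element just before it, you could do: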

from BeautifulSoup import BeautifulSoup, Comment

soup = BeautifulSoup('''<html>
<body>
<p>paragraph 1</p>
<p>paragraph 2</p>
<!--some comment text-->
<p>paragraph 3</p>
</body>
</html>''')

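# Match only Comment nodes whose text is exactly 'some comment text'.
def right_comment(e):
    return isinstance(e, Comment) and e == 'some comment text'

# Locate the comment node in the parse tree.
e = soup.find(text=right_comment)

# Print the nearest preceding sibling that is a <p> tag.
print e.findPreviousSibling('p')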

...which will print out:

<p>paragraph 2</p>
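
The code above is written for BeautifulSoup 3 on Python 2 (note the `from BeautifulSoup import` path and the `print` statement). If you are on Python 3 with the `bs4` package, a rough equivalent might look like the sketch below. It assumes `bs4` is installed and uses the standard-library `html.parser`; the names `HTML`, `is_target_comment`, `comment`, `previous_p` and `next_p` are just illustrative.

```python
from bs4 import BeautifulSoup, Comment

HTML = """<html>
<body>
<p>paragraph 1</p>
<p>paragraph 2</p>
<!--some comment text-->
<p>paragraph 3</p>
</body>
</html>"""

# Parse with the stdlib parser so no extra parser package is needed.
soup = BeautifulSoup(HTML, "html.parser")


def is_target_comment(node):
    """Return True only for a Comment node with exactly this text."""
    return isinstance(node, Comment) and node == "some comment text"


# Comments are string nodes, so they can be searched via the string= filter.
comment = soup.find(string=is_target_comment)

# Walk the siblings of the comment to reach the neighbouring <p> tags.
previous_p = comment.find_previous_sibling("p")
next_p = comment.find_next_sibling("p")

print(previous_p)  # <p>paragraph 2</p>
print(next_p)      # <p>paragraph 3</p>
```

`find_previous_sibling("p")` skips over the whitespace text nodes that sit between the comment and the tags, which is why it is used here instead of `previous_sibling`.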
