Question

The HTML file I'm parsing has multiple <p> tags, as shown below:

<p>first text</p> ... ... ... ... <p>my text</p>

This prints the text of the first paragraph:

print(soup.find("section", {"id": "posts"}).article.div.p.text)

first text

How do I print the last one?

Answer 1

Use find_all to get all the p elements as a list, take the last element, and then read its text attribute:

soup.find("section", {"id": "posts"}).article.div.find_all('p')[-1].text
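As a minimal, self-contained sketch of this approach: the markup below is a made-up stand-in for the page in the question (the paragraph contents and the `html`, `container`, and `paragraphs` names are assumptions for illustration). It also guards against the case where no p is found, since indexing an empty list with [-1] raises an IndexError.

```python
from bs4 import BeautifulSoup

# Hypothetical markup that mirrors the structure described in the question.
html = """
<section id="posts">
  <article>
    <div>
      <p>first text</p>
      <p>second text</p>
      <p>my text</p>
    </div>
  </article>
</section>
"""

soup = BeautifulSoup(html, "html.parser")
container = soup.find("section", {"id": "posts"}).article.div

# find_all returns a list-like ResultSet in document order.
paragraphs = container.find_all("p")

# Guard against an empty result before indexing with [-1].
last_text = paragraphs[-1].text if paragraphs else None
print(last_text)
```

Note that find_all searches all descendants by default, so a p nested deeper inside the div would also be included. If only direct children are wanted, passing recursive=False to find_all restricts the search.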

Answer 2

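You can also solve this with the find_next_sibling method.

For example, to extract the fourth tag, start from the first p and call it three times: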

l1 = soup.find("section", {"id": "posts"}).article.div.p
l2 = l1.find_next_sibling('p')
l2 = l2.find_next_sibling('p')
l2 = l2.find_next_sibling('p')

print (l2.text)
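Chaining find_next_sibling only works when the position of the target tag is known in advance. If the goal is the last paragraph regardless of how many there are, one possible variation is find_next_siblings (plural), which returns all following siblings that match. The sketch below uses made-up markup and names (`html`, `first_p`, `later`, `last_p`) purely for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical markup that mirrors the structure described in the question.
html = """
<section id="posts">
  <article>
    <div>
      <p>first text</p>
      <p>second text</p>
      <p>third text</p>
      <p>my text</p>
    </div>
  </article>
</section>
"""

soup = BeautifulSoup(html, "html.parser")
first_p = soup.find("section", {"id": "posts"}).article.div.p

# All later <p> siblings of the first paragraph, in document order.
later = first_p.find_next_siblings("p")

# If there are no later siblings, the first paragraph is also the last one.
last_p = later[-1] if later else first_p
print(last_p.text)
```

Unlike find_all, this only considers tags at the same level as the first p, so paragraphs nested inside other elements are not picked up.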

How do I extract the text of the last paragraph tag in BeautifulSoup?
