Beautiful Soup: grab the next element

Time: 2016-11-09 01:43:21

Tags: python beautifulsoup python-requests

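I'm trying to pull the electoral vote count so I can check when it updates. The difficulty is that all the class names change on every refresh, so I can't select by class. Instead I want to search for the text Trump and then find the next element, which holds the count.

I can find the element by searching for the string Trump: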

import re
import requests
from bs4 import BeautifulSoup

url = "https://www.google.com/webhp?sourceid=chrome-instant&ion=1&espv=2&ie=UTF-8#q=who+is+winning+the+presidential+election&eob=enn/p//1/0///////////"
r = requests.get(url)
soup = BeautifulSoup(r.content, "html.parser")  # name the parser explicitly
elm = soup.find(text='Trump')  # a NavigableString on an exact match, else None
print(elm)

I find the Trump element with elm = soup.find(text='Trump'), but I don't know how to grab the next element after it.

1 answer:

Answer 0: (score: 3)

Your current code only finds a node whose text matches that string exactly. Try this:

soup.body.findAll(text=re.compile('Trump'))
> ["Donald Trump is US president-elect in 'America's Brexit' as Hillary Clinton concedes election - live", 'Donald Trump ', 'Donald Trump wins presidential election, plunging US into uncertain future'... ]
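To make the difference concrete, here is a minimal sketch run against a small inline snippet instead of the live page. The markup is made up for illustration, and `string=` is the newer spelling of the `text=` argument used above.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page.
html = "<div><span>Donald Trump </span><span>Trump</span><b>Trump wins</b></div>"
soup = BeautifulSoup(html, "html.parser")

# A plain string only matches text nodes that are exactly equal to it.
exact = [str(s) for s in soup.find_all(string="Trump")]

# A compiled regex matches any text node that contains the pattern.
partial = [str(s) for s in soup.find_all(string=re.compile("Trump"))]

print(exact)
print(partial)
```

The exact search returns only the one node whose whole text is Trump, while the regex search returns all three.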

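This searches with a regular expression, so it matches any text node that contains the target text. You can narrow the regular expression to pick out the one you want, for example: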

soup.body.findAll(text=re.compile('Trump wins .+? uncertain future'))
> ['Donald Trump wins presidential election, plunging US into uncertain future']
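Once the text node is located, one way to reach the element after it is `find_next`, which walks forward in document order. The following is only a sketch: the markup, the class names, and the counts 123 and 456 are placeholders, and it assumes the count sits in the next `span` after the name, which may not hold for the real page.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical markup; the class names are meant to look random, as in the
# question, and the numbers are placeholders rather than real counts.
html = """
<div class="x1"><span class="a9">Donald Trump</span><span class="b7">123</span></div>
<div class="x2"><span class="c3">Hillary Clinton</span><span class="d5">456</span></div>
"""
soup = BeautifulSoup(html, "html.parser")

# Locate the text node by content, so the changing class names don't matter.
label = soup.find(string=re.compile("Trump"))

# find_next("span") returns the first <span> that starts after the text node.
count_tag = label.find_next("span")
count = count_tag.get_text(strip=True)

print(count)
```

Because the lookup is anchored on the text and the tag name, it keeps working when the classes are regenerated, as long as the layout around the name stays the same.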