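I was wondering if you could help me with some web scraping.
Below is the span class I want to get data from. The problem is that the class name contains a random number that differs for each data point.
I know the "price-val" part is the same for every iteration, but I can't figure out how to search on just that part when fetching the data.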
<span class="price-val_196775436 odd-val ib right">
2.47
</span>
My code so far:
url = "http://www.sportsbet.com.au/betting/american-football"
r = requests.get(url)
soup = BeautifulSoup(r.content)
g_data = soup.find_all("div", {"class": "accordion-body"})
for item in g_data:
    A = item.find('span', {'class': 'team-name ib'}).text
    B = item.find('span', {'class': 'price-val_196775436 odd-val ib right'}).text
The error I get:
Traceback (most recent call last):
  File "C:\Users\James\Desktop\NFLsportsbet.py", line 23, in <module>
    B = item.find('span', {'class': 'price-val'}).text
AttributeError: 'NoneType' object has no attribute 'text'
Answer 0 (score: 2)
Specify a parser library such as lxml.
You will probably also need a regular expression or a lambda to match the class by its prefix:
import re

import requests
from bs4 import BeautifulSoup

url = "http://www.sportsbet.com.au/betting/american-football"
r = requests.get(url)
soup = BeautifulSoup(r.content, 'lxml')  # name the parser explicitly
g_data = soup.find_all("div", {"class": "accordion-body"})
for items in g_data:
    print(items.find('span', {'class': 'team-name ib'}).text)
    # Match any class token that starts with the fixed 'price-val_' prefix
    print(items.find('span', {'class': lambda L: L and L.startswith('price-val_')}).text)
    # print(items.find('span', {'class': re.compile(r'^price-val_')}).text)  # or a regex like this
Output:
Detroit Lions
2.47
Tampa Bay Buccaneers
3.85
Arizona Cardinals
1.39
San Diego Chargers
2.65
San Francisco 49ers
3.95
New York Giants
2.40
Cincinnati Bengals
1.97
Tennessee Titans
2.61
Minnesota Vikings
1.90
New York Jets
1.66
Seattle Seahawks
1.46
Green Bay Packers
1.68
Indianapolis Colts
3.22
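The prefix match above can also be tried offline, without hitting the live site. The following is only a minimal sketch: the `HTML` string is hypothetical markup modelled on the span in the question, the helper names `extract_odds` and `extract_prices_css` are made up for illustration, and the built-in `html.parser` is used instead of lxml just to avoid an extra dependency. It also checks for `None` before reading the text, which avoids the `AttributeError` from the question when a block has no price span.

```python
import re

from bs4 import BeautifulSoup

# Hypothetical markup modelled on the span in the question.
# The second block deliberately has no price span.
HTML = """
<div class="accordion-body">
  <span class="team-name ib">Detroit Lions</span>
  <span class="price-val_196775436 odd-val ib right">
  2.47
  </span>
</div>
<div class="accordion-body">
  <span class="team-name ib">Tampa Bay Buccaneers</span>
</div>
"""

# Beautiful Soup tests each class token separately, so the pattern only
# has to describe the one token that carries the random number.
PRICE_CLASS = re.compile(r"^price-val_\d+$")


def extract_odds(html):
    """Return (team, price) pairs; price is None when no price span exists."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for block in soup.find_all("div", class_="accordion-body"):
        team = block.find("span", class_="team-name")
        price = block.find("span", class_=PRICE_CLASS)
        if team is None:
            continue  # skip blocks that do not describe a team
        # find() returns None on a miss, so guard before reading the text.
        price_text = price.get_text(strip=True) if price is not None else None
        rows.append((team.get_text(strip=True), price_text))
    return rows


def extract_prices_css(html):
    """Same idea with a CSS attribute selector (substring match on class)."""
    soup = BeautifulSoup(html, "html.parser")
    return [s.get_text(strip=True) for s in soup.select('span[class*="price-val_"]')]


if __name__ == "__main__":
    for team, price in extract_odds(HTML):
        print(team, price)
```

Passing `class_="team-name"` matches on a single class token, so it keeps working if the site reorders or adds other classes, whereas the exact string `'team-name ib'` only matches that precise class attribute.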