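I want to extract odds information from the website http://www.footballlocks.com/nfl_odds.shtml using Python.
I have been trying to use BeautifulSoup.
Ideally I would get the odds as a dictionary or a list, because the values will be fed into mathematical formulas.
The HTML for the odds information is: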
<TABLE COLS="6" WIDTH="650" BORDER="0" CELLSPACING="5" CELLPADDING="2">
<TR>
<TD WIDTH="19%"><span title="Date and Time of Game."><B>Date & Time</B></span></TD>
<TD WIDTH="21%"><span title="Team Spotting Points in a Bet Against the Point Spread."><B>Favorite</B></span></TD>
<TD WIDTH="14%"><span title="Short for Point Spread. Number of Points Subtracted from Final Score of Favorite to Determine Winner of a Point Spread Based Wager."><B>Spread</B></span></TD>
<TD WIDTH="21%"><span title="Team Receiving Points in a Bet With the Point Spread."><B>Underdog</B></span></TD>
<TD WIDTH="6%"><span title="Line for Betting Over or Under the Total number of Points Scored by Both Teams Combined. Synonymous With Over/Under."><B>Total</B></span></TD>
<TD WIDTH="19%"><span title="Money odds to Win the Game Outright, Without any Point Spread.
Minus (-) is Amount Bettors Risk for Each $100 on the Favorite to Win the Game Outright.
Plus (+) is Amount Bettors Win for Each $100 Risked on the Underdog to Win the Game Outright."><B>Money Odds</B></span></TD>
</TR>
<TR>
<TD>9/18 1:00 ET</TD>
<TD>At Detroit</TD>
<TD> -6</TD>
<TD>Tennessee</TD>
<TD>47</TD>
<TD>-$255 +$215</TD>
</TR>
<TR>
<TD>9/18 1:00 ET</TD>
<TD>At Houston</TD>
<TD> -2.5</TD>
<TD>Kansas City</TD>
<TD>43</TD>
<TD>-$140 +$120</TD>
</TR>
<TR>
<TD>9/18 1:00 ET</TD>
<TD>At New England</TD>
<TD> -6.5</TD>
<TD>Miami</TD>
<TD>42</TD>
<TD>-$290 +$240</TD>
</TR>
<TR>
<TD>9/18 1:00 ET</TD>
<TD>Baltimore</TD>
<TD> -6.5</TD>
<TD>At Cleveland</TD>
<TD>42.5</TD>
<TD>-$300 +$250</TD>
</TR>
<TR>
<TD>9/18 1:00 ET</TD>
<TD>At Pittsburgh</TD>
<TD> -3.5</TD>
<TD>Cincinnati</TD>
<TD>48.5</TD>
<TD>-$180 +$160</TD>
</TR>
<TR>
<TD>9/18 1:00 ET</TD>
<TD>At Washington</TD>
<TD> -2.5</TD>
<TD>Dallas</TD>
<TD>45.5</TD>
<TD>-$145 +$125</TD>
</TR>
<TR>
<TD>9/18 1:00 ET</TD>
<TD>At NY Giants</TD>
<TD> -4.5</TD>
<TD>New Orleans</TD>
<TD>53.5</TD>
<TD>-$225 +$185</TD>
</TR>
<TR>
<TD>9/18 1:00 ET</TD>
<TD>At Carolina</TD>
<TD> -13.5</TD>
<TD>San Francisco</TD>
<TD>45</TD>
<TD>-$900 +$600</TD>
</TR>
<TR>
<TD>9/18 4:05 ET</TD>
<TD>At Arizona</TD>
<TD> -7</TD>
<TD>Tampa Bay</TD>
<TD>50</TD>
<TD>-$310 +$260</TD>
</TR>
<TR>
<TD>9/18 4:05 ET</TD>
<TD>Seattle</TD>
<TD> -6.5</TD>
<TD>At Los Angeles</TD>
<TD>38</TD>
<TD>-$290 +$240</TD>
</TR>
<TR>
<TD>9/18 4:25 ET</TD>
<TD>At Denver</TD>
<TD> -6.5</TD>
<TD>Indianapolis</TD>
<TD>46.5</TD>
<TD>-$280 +$240</TD>
</TR>
<TR>
<TD>9/18 4:25 ET</TD>
<TD>At Oakland</TD>
<TD> -4.5</TD>
<TD>Atlanta</TD>
<TD>49</TD>
<TD>-$210 +$180</TD>
</TR>
<TR>
<TD>9/18 4:25 ET</TD>
<TD>At San Diego</TD>
<TD> -3</TD>
<TD>Jacksonville</TD>
<TD>47</TD>
<TD>-$165 +$145</TD>
</TR>
<TR>
<TD>9/18 8:30 ET</TD>
<TD>Green Bay</TD>
<TD> -2.5</TD>
<TD>At Minnesota</TD>
<TD>43.5</TD>
<TD>-$140 +$120</TD>
</TR>
</TABLE>
My Python code so far:
from urllib.request import urlopen  # Python 3 location of urlopen

from bs4 import BeautifulSoup

url = "http://www.footballlocks.com/nfl_odds.shtml"
html = urlopen(url)
soup = BeautifulSoup(html, 'html.parser')
# Print the text of every cell in every table row on the page
for record in soup.find_all('tr'):
    for data in record.find_all('td'):
        print(data.text)
P.S. My background is in economics, and my programming experience is limited.
Answer 0: (score: 1)
This is not the cleanest HTML parsing, because the markup has no classes to select on, but it puts every row into a list of dicts:
from pprint import pprint as pp

import requests
from bs4 import BeautifulSoup

url = "http://www.footballlocks.com/nfl_odds.shtml"
soup = BeautifulSoup(requests.get(url).content, "html.parser")
# Use the text of one of the headers to find the correct table
table = soup.find("span", text="Date & Time").find_previous("table")
data = []
# "tr + tr" matches every row that follows another row, which skips the header row
for row in table.select("tr + tr"):
    tds = [td.text for td in row.find_all("td")]
    # columns: date, favorite, spread, underdog, total, money odds
    fav, under, odds = tds[1], tds[3], tds[-1]
    # split money odds into fav/under odds
    f_odds, u_odds = odds.split()
    data.append({fav: f_odds.replace(u"$", ""), under: u_odds.replace(u"$", "")})
pp(data)
Output:
[{u'At Detroit': u'-255', u'Tennessee': u'+215'},
{u'At Houston': u'-130', u'Kansas City': u'+110'},
{u'At New England': u'-290', u'Miami': u'+240'},
{u'At Cleveland': u'+225', u'Baltimore': u'-265'},
{u'At Pittsburgh': u'-175', u'Cincinnati': u'+155'},
{u'At Washington': u'-150', u'Dallas': u'+130'},
{u'At NY Giants': u'-215', u'New Orleans': u'+180'},
{u'At Carolina': u'-900', u'San Francisco': u'+600'},
{u'At Arizona': u'-330', u'Tampa Bay': u'+270'},
{u'At Los Angeles': u'+250', u'Seattle': u'-300'},
{u'At Denver': u'-275', u'Indianapolis': u'+235'},
{u'At Oakland': u'-210', u'Atlanta': u'+180'},
{u'At San Diego': u'-160', u'Jacksonville': u'+140'},
{u'At Minnesota': u'+115', u'Green Bay': u'-135'}]
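Since the values are meant to go into formulas, it may help to convert them to numbers rather than keep them as strings. The following is only a sketch in Python 3: `parse_row` and its dictionary keys are hypothetical names, not part of BeautifulSoup, and it assumes every row has the six columns shown in the question's HTML with plain numeric spread, total, and money odds. A row with any other content in those cells would raise a `ValueError` or fail to unpack.

```python
def parse_row(cells):
    """Turn the six cell texts of one table row into a dict with numeric fields.

    Assumes the column order from the question's HTML:
    date/time, favorite, spread, underdog, total, money odds.
    """
    date_time, favorite, spread, underdog, total, money = [c.strip() for c in cells]
    # Money odds look like "-$255 +$215": favorite first, underdog second.
    fav_odds, dog_odds = (int(part.replace("$", "")) for part in money.split())
    return {
        "date_time": date_time,
        "favorite": favorite,
        "spread": float(spread),
        "underdog": underdog,
        "total": float(total),
        "favorite_odds": fav_odds,
        "underdog_odds": dog_odds,
    }


# Cell texts copied from the first data row of the table in the question.
sample = ["9/18 1:00 ET", "At Detroit", " -6", "Tennessee", "47", "-$255 +$215"]
record = parse_row(sample)
print(record["favorite"], record["spread"], record["favorite_odds"])
```

To use it with the scraping loop, one option is to pass the list of cell texts for each row to `parse_row` and append the result, after checking that the row really has six cells.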