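Hi, I'm a beginner,
but I wrote the script below to scrape the following league table: http://i.stack.imgur.com/98FPr.png
Website: http://www.bbc.com/sport/football/spanish-la-liga/table
I am trying to print the position and the team name. The team name prints fine, but for the position I keep getting None. Can someone help me fix this?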
import urllib2
from bs4 import BeautifulSoup

url = "http://www.bbc.com/sport/football/spanish-la-liga/table"
soup = BeautifulSoup(urllib2.urlopen(url).read())
for row in soup("table", {"class": "table-stats"})[0].tbody("tr"):
    tds = row("td")
    print tds[1].string, tds[2].string
Answer 0 (score: 0)
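You can find the problem easily with an interactive Python shell.
The issue is that tds[1] (or in full, soup("table", {"class":"table-stats"})[0].tbody("tr")[0]("td")[1], to name just one) contains
<td class="position"><span class="no-movement">No movement</span> <span class="position-number">1</span></td>
The .string attribute of this td (the one with class position) is None, because the cell has more than one child: two span elements separated by a whitespace text node, so BeautifulSoup cannot tell which string is meant. You can extract the actual number with tds[1].contents[2].string, which takes the third child, the position-number span.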
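The behaviour described above can be checked without hitting the website. The following is a minimal sketch that parses only the td snippet quoted in this answer; the explicit "html.parser" argument and the class-based lookup at the end are additions for illustration, not part of the original script.

```python
from bs4 import BeautifulSoup

# The cell quoted above, parsed on its own (no network access needed).
html = ('<td class="position"><span class="no-movement">No movement</span> '
        '<span class="position-number">1</span></td>')
soup = BeautifulSoup(html, "html.parser")
td = soup.find("td")

# .string is None because the td has several children.
print(repr(td.string))

# The children are: span, whitespace text node, span.
print(len(td.contents))

# Index-based access, as suggested in this answer.
by_index = td.contents[2].string

# Alternative (an assumption, not from the original answer): look the span
# up by its class so the code does not depend on the child order.
by_class = td.find("span", {"class": "position-number"}).string

print(by_index)
print(by_class)
```

The index-based form is shorter, while the class-based lookup may be less sensitive to extra whitespace or additional spans in the cell.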
The fully corrected script:
#!/usr/bin/env python
import urllib2
from bs4 import BeautifulSoup

url = "http://www.bbc.com/sport/football/spanish-la-liga/table"
soup = BeautifulSoup(urllib2.urlopen(url).read())
for row in soup("table", {"class": "table-stats"})[0].tbody("tr"):
    tds = row("td")
    # contents[2] is the position-number span inside the position cell
    print tds[1].contents[2].string, tds[2].string
Output:
1 Barcelona
2 Real Madrid
3 Valencia
4 Atl Madrid
5 Sevilla
6 Villarreal
7 Málaga
8 Ath Bilbao
9 Espanyol
10 Real Sociedad
11 Celta de Vigo
12 Rayo Vallecano
13 Getafe
14 Eibar
15 Elche
16 Almería
17 Deportivo de La Coruña
18 Levante
19 Granada CF
20 Córdoba
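For reuse or offline testing, the same extraction could be wrapped in a function that takes the HTML as a string. This is only a sketch under stated assumptions: `extract_rows` and `SAMPLE_HTML` are hypothetical names, and the sample table is a simplified stand-in for the real page, keeping just the table-stats class, the position cell quoted above, and the team name in the third cell.

```python
from bs4 import BeautifulSoup

# Hypothetical, simplified stand-in for the BBC table markup (assumption).
SAMPLE_HTML = """
<table class="table-stats"><tbody>
<tr><td></td>
    <td class="position"><span class="no-movement">No movement</span> <span class="position-number">1</span></td>
    <td>Barcelona</td></tr>
<tr><td></td>
    <td class="position"><span class="no-movement">No movement</span> <span class="position-number">2</span></td>
    <td>Real Madrid</td></tr>
</tbody></table>
"""


def extract_rows(html):
    """Return a list of (position, team) string pairs from the table HTML."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", {"class": "table-stats"})
    rows = []
    for row in table.tbody.find_all("tr"):
        tds = row.find_all("td")
        number = tds[1].find("span", {"class": "position-number"})
        if number is None:
            # Skip rows that do not carry a position number.
            continue
        rows.append((number.get_text(strip=True), tds[2].get_text(strip=True)))
    return rows


for position, team in extract_rows(SAMPLE_HTML):
    print("%s %s" % (position, team))
```

To run it against the live page, the downloaded HTML (for example the result of urllib2.urlopen(url).read() from the script above) could be passed in place of SAMPLE_HTML, assuming the page still uses this markup.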