I have the following code:
import requests
from bs4 import BeautifulSoup

players = ['a', 'b', 'c']  # etc.
b = []  # collected salary strings
for player in players:
    html = 'https://hoopshype.com/player/' + player + '/salary/'
    webpage = requests.get(html)
    content = webpage.content
    soup = BeautifulSoup(content, "html.parser")
    # find() returns None when the page has no matching table
    table = soup.find('table', {'class': 'player-payroll-1'})
    for row in table.find_all('tr'):
        for item in row.find_all('td', {'class': 'table-value'}):
            a = item.text
            c = a.replace("\n", "").replace("\t", "")
            b.append(c)
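I am trying to loop over a large number of players. I know the code itself works, because I have checked it individually against several players and it succeeded for them.
But when I loop over the whole list, the for loop stops with the error: AttributeError: 'NoneType' object has no attribute 'find_all'
I am looking for a way to run the for loop so that it:
a) identifies exactly which items in the list cause the error, and b) keeps going through the rest of the list despite the error.
Is there a way to do this?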
Answer 0 (score: 2)
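To handle the error, you can use try and except. However, it does not seem necessary to visit every player's page just to get the salaries, since they are already in the main salaries table. Is there any particular reason you need to request EACH player's page?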
Here is the whole table:
import pandas as pd
df = pd.read_html('https://hoopshype.com/salaries/players/')[0]
Output of the first 10 rows:
print(df.head(10).to_string())
   Unnamed: 0             Player      2019/20      2020/21      2021/22      2022/23  2023/24  2024/25
0         1.0      Stephen Curry  $40,231,758  $43,006,362  $45,780,966           $0       $0       $0
1         2.0  Russell Westbrook  $38,506,482  $41,358,814  $44,211,146  $47,063,478       $0       $0
2         2.0         Chris Paul  $38,506,482  $41,358,814  $44,211,146           $0       $0       $0
3         4.0       James Harden  $38,199,000  $41,254,920  $44,310,840  $47,366,760       $0       $0
4         4.0          John Wall  $38,199,000  $41,254,920  $44,310,840  $47,366,760       $0       $0
5         6.0       LeBron James  $37,436,858  $39,219,566  $41,002,274           $0       $0       $0
6         7.0       Kevin Durant  $37,199,000  $39,058,950  $40,918,900  $42,778,850       $0       $0
7         8.0      Blake Griffin  $34,449,964  $36,810,996  $38,957,028           $0       $0       $0
8         9.0         Kyle Lowry  $33,296,296  $30,500,000           $0           $0       $0       $0
9        10.0        Paul George  $33,005,556  $35,450,412  $37,895,268           $0       $0       $0
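If you do still need the individual player pages, the try/except approach mentioned at the top of this answer could look roughly like the sketch below. It reuses the URL pattern and table class from the question. The function names scrape_salaries and scrape_all, the results/failed containers, and the timeout value are hypothetical choices made for illustration, not part of the original code. It only catches AttributeError (the error reported in the question); network errors raised by requests would still stop the loop unless you catch those as well.

```python
def scrape_salaries(player):
    """Return the cleaned salary cells from one player's page."""
    # Third-party imports are kept local so that scrape_all below
    # has no dependencies of its own.
    import requests
    from bs4 import BeautifulSoup

    url = 'https://hoopshype.com/player/' + player + '/salary/'
    webpage = requests.get(url, timeout=10)  # timeout is an assumed value
    soup = BeautifulSoup(webpage.content, 'html.parser')
    table = soup.find('table', {'class': 'player-payroll-1'})
    # If the page has no such table, table is None and the next line
    # raises AttributeError, exactly as in the question.
    values = []
    for row in table.find_all('tr'):
        for item in row.find_all('td', {'class': 'table-value'}):
            values.append(item.text.replace('\n', '').replace('\t', ''))
    return values


def scrape_all(players, scrape_one=scrape_salaries):
    """Scrape every player, recording failures instead of stopping."""
    results = {}  # player -> list of salary strings
    failed = []   # (player, error message) for every player that failed
    for player in players:
        try:
            results[player] = scrape_one(player)
        except AttributeError as err:
            # Remember which player failed and why; the loop then
            # simply moves on to the next player.
            failed.append((player, str(err)))
    return results, failed
```

Calling results, failed = scrape_all(players) and then printing failed answers both parts of the question: the loop runs to the end, and failed lists exactly which names had no salary table. Those names are usually worth checking by hand, since a missing table often just means the URL slug for that player is different.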