我目前正在使用python网络采集每个NBA球员的三点统计数据,并试图将这些数据放入数据框中。下面的代码是我尝试将值添加到数据框中的尝试。变量players,teams,threePointAttempts和ThreePointPercentage都是包含50个值的列表。在while循环的每次迭代之后,这些脚本都会重新填充,因为脚本会在NBA网站的每一页中移动。
while i<10:
soup = BeautifulSoup(d.page_source, 'html.parser').find('table')
headers, [_, *data] = [i.text for i in soup.find_all('th')], [[i.text for i in b.find_all('td')] for b in soup.find_all('tr')]
final_data = [i for i in data if len(i) > 1]
data_attrs = [dict(zip(headers, i)) for i in final_data]
print(data_attrs)
players = [i['PLAYER'] for i in data_attrs]
teams = [i['TEAM'] for i in data_attrs]
threePointAttempts = [i['3PA'] for i in data_attrs]
threePointPercentage = [i['3P%'] for i in data_attrs]
data_df = data_df.append(pd.DataFrame(players, columns=['Player']),ignore_index=True)
data_df = data_df.append(pd.DataFrame(teams, columns=['Team']),ignore_index=True)
data_df = data_df.append(pd.DataFrame(threePointAttempts, columns=['3PA']),ignore_index=True)
data_df = data_df.append(pd.DataFrame(threePointPercentage, columns=['3P%']),ignore_index=True)
data_df = data_df[['Player','Team','3PA','3P%']]
我遇到的问题是数据帧填充如下:
答案 0 :(得分:0)
尝试:
temp_df = pd.DataFrame({'Player': players,
'Team': teams,
'3PA': threePointAttempts,
'3P%': threePointPercentage})
data_df = data_df.append(temp_df, ignore_index=True)