In the code below, I have successfully scraped a list of every MLB team and its win probability for the day (April 18th). I would like to export this data to a CSV file, but when I run the code, only one team and one win probability get exported. Does anyone know why this happens? I think the CSV writer needs to go inside another for loop, but I am not sure how to do that with two separate sources of scraped data (team name and win probability). Thanks in advance!
import requests
import csv
from bs4 import BeautifulSoup
page=requests.get('https://www.fangraphs.com/livescoreboard.aspx?date=2018-04-18')
soup=BeautifulSoup(page.text, 'html.parser')
[link.decompose() for link in soup.find_all(class_='lineup')]
f=csv.writer(open('Win_Probability.csv','w'))
f.writerow(['Teams','Win_Prob'])
team_name_list=soup.find(class_='RadAjaxPanel')
team_name_list_items=team_name_list.find_all('a')
for team_name in team_name_list_items:
    teams=team_name.contents[0]
    print(teams)
winprob_list=soup.find(class_='RadAjaxPanel')
winprob_list_items=winprob_list.find_all('td',attrs={'style':'border:1px solid black;'})
for winprob in winprob_list_items:
    winprobperc=winprob.contents[0]
    print(winprobperc)
f.writerow([teams,winprobperc])
Answer 0 (score: 1)
The call f.writerow([teams,winprobperc]) is not in a loop, so it runs only once, after both loops have finished, and writes just the last team and the last win probability. You need to loop through all the teams and call writerow for each one.
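A minimal sketch of that fix, assuming the two sets of scraped values have already been collected into lists. The sample values are made up for illustration, and an in-memory buffer stands in for Win_Probability.csv:

```python
import csv
import io

# Hypothetical stand-ins for the values scraped from the page.
team_names = ['Yankees', 'Marlins', 'Cubs']
win_probs = ['61.2%', '38.8%', '54.0%']

# In-memory buffer standing in for open('Win_Probability.csv', 'w', newline='').
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(['Teams', 'Win_Prob'])

# writerow is called inside the loop, once per team.
for team, prob in zip(team_names, win_probs):
    writer.writerow([team, prob])

csv_text = buffer.getvalue()
print(csv_text)
```

Pairing the lists with zip is one way to walk both sources together; it stops at the shorter list, so it is worth confirming that both lists have the same length.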
Answer 1 (score: 0)
I think you're overwriting the teams and winprobperc variables when you loop over them.
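The effect is easy to reproduce without any scraping. In this small sketch (the sample names are made up), the variable is reassigned on every pass, so only the final value is left once the loop ends:

```python
# Hypothetical sample data standing in for the scraped team names.
sample_teams = ['Yankees', 'Marlins', 'Cubs']

for name in sample_teams:
    teams = name  # reassigned on every iteration

# After the loop, only the last assignment is left.
print(teams)  # Cubs
```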
You can try using a list comprehension for each one, like this:
teams = [team.contents[0] for team in team_name_list_items]
winprobperc = [prob.contents[0] for prob in winprob_list_items]
This builds two lists, one of team names and one of win probabilities, taking the first child of each element (the string you need) instead of keeping only the last one.
Assuming the two lists are of equal length, you can then write them as rows to the CSV:
# range replaces xrange, which exists only in Python 2
for i in range(len(teams)):
    f.writerow([teams[i], winprobperc[i]])
Depending on your case, it might be better to build the complete data table first and then write all the rows at once. To do this, you can generate a 2D list containing every row, again indexing by the length of one of the lists:
data = [[teams[i], winprobperc[i]] for i in range(len(teams))]
f.writerows(data)
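Putting the pieces together, here is one possible version, offered as a sketch and not as a tested fix for the live page. The helper name write_win_probs and the sample values are hypothetical. It pairs the two lists with zip, refuses to continue if their lengths differ, and opens the file with newline='' as the csv module documentation recommends:

```python
import csv
import os
import tempfile


def write_win_probs(path, teams, winprobperc):
    """Write one CSV row per (team, win probability) pair."""
    if len(teams) != len(winprobperc):
        raise ValueError('teams and winprobperc differ in length')
    # Build the whole table first, then write it in one call.
    data = [[team, prob] for team, prob in zip(teams, winprobperc)]
    # newline='' leaves line endings to the csv module (avoids blank rows on Windows).
    with open(path, 'w', newline='') as handle:
        writer = csv.writer(handle)
        writer.writerow(['Teams', 'Win_Prob'])
        writer.writerows(data)
    return len(data)


# Hypothetical sample values; the real lists would come from the scrape.
out_path = os.path.join(tempfile.mkdtemp(), 'Win_Probability.csv')
rows_written = write_win_probs(out_path, ['Yankees', 'Marlins'], ['61.2%', '38.8%'])

# Read the file back to confirm what was written.
with open(out_path, newline='') as handle:
    rows = list(csv.reader(handle))
print(rows)
```

Using a with block also closes the file when writing finishes, which the original open call inside csv.writer never does explicitly.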