Writing and Saving CSV File From Scraping Data

Time: 2018-04-20 00:43:20

Tags: python csv beautifulsoup

In the code below, I have successfully scraped a list of every MLB team and its corresponding win probability for the day (April 18th). I would like to export this data to a CSV file, but when I run the code, only one team and win probability get exported. Does anyone know why this happens? I'm thinking there needs to be another for loop around the CSV writer, but I am not exactly sure how to do that with two separate sources of scraped data (team name and win probability). Thanks in advance!

import requests
import csv
from bs4 import BeautifulSoup

page=requests.get('https://www.fangraphs.com/livescoreboard.aspx?date=2018-04-18')
soup=BeautifulSoup(page.text, 'html.parser')


[link.decompose() for link in soup.find_all(class_='lineup')]

f=csv.writer(open('Win_Probability.csv','w'))
f.writerow(['Teams','Win_Prob'])

team_name_list=soup.find(class_='RadAjaxPanel')
team_name_list_items=team_name_list.find_all('a')


for team_name in team_name_list_items:
  teams=team_name.contents[0]
  print(teams)

winprob_list=soup.find(class_='RadAjaxPanel')
winprob_list_items=winprob_list.find_all('td',attrs={'style':'border:1px solid black;'})

for winprob in winprob_list_items:
  winprobperc=winprob.contents[0]
  print(winprobperc)


f.writerow([teams,winprobperc])

2 Answers:

Answer 0 (score: 1)

f.writerow([teams,winprobperc])

is not in a loop, so it runs only once, after both loops have finished, and writes a single team and win probability (the last value each variable held). You need to loop through all the teams and call writerow for each one.
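A minimal sketch of that fix, assuming the two scraped lists line up one-to-one. The helper name write_rows and the sample values are placeholders for illustration, not data from the page, and an in-memory buffer stands in for the output file:

```python
import csv
import io

def write_rows(out, teams, win_probs):
    """Write a header and one row per (team, probability) pair to out."""
    writer = csv.writer(out)
    writer.writerow(['Teams', 'Win_Prob'])
    # writerow sits inside the loop, so it runs once per team
    for team, prob in zip(teams, win_probs):
        writer.writerow([team, prob])

# Placeholder values standing in for the scraped strings
sample_teams = ['Team A', 'Team B']
sample_probs = ['55.0%', '45.0%']

buffer = io.StringIO()
write_rows(buffer, sample_teams, sample_probs)
csv_text = buffer.getvalue()
```

In the question's script, the same loop would take the lists built from team_name_list_items and winprob_list_items instead of the sample values.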

Answer 1 (score: 0)

I think you're overwriting the teams and winprobperc variables when you loop over them.

You can try using list comprehensions like this:

teams = [team.contents[0] for team in team_name_list_items]
winprobperc = [prob.contents[0] for prob in winprob_list_items]

This generates a list of all the items in each list, properly fetching the exact string you need from the elements.

Assuming these lists are of equal length, you can then write them as rows to the CSV:

for i in range(len(teams)):  # use xrange on Python 2
    f.writerow([teams[i], winprobperc[i]])

Depending on your case, it might be beneficial to build the complete data table first and then write all the rows at once. To do this, you can generate a list of rows (a 2D list), again indexed by the length of one of the lists:

data = [[teams[i], winprobperc[i]] for i in range(len(teams))]
f.writerows(data)
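For reference, a self-contained sketch of the build-then-write approach, assuming Python 3. The function name save_table, the temporary output path, and the sample values are hypothetical; the length check is an added safeguard, not part of the original answer:

```python
import csv
import os
import tempfile

def save_table(path, teams, win_probs):
    """Write all rows at once; refuse to write if the lists do not line up."""
    if len(teams) != len(win_probs):
        raise ValueError('teams and win_probs differ in length')
    rows = [[team, prob] for team, prob in zip(teams, win_probs)]
    # newline='' lets the csv module control line endings
    # (this avoids blank rows between records on Windows)
    with open(path, 'w', newline='') as handle:
        writer = csv.writer(handle)
        writer.writerow(['Teams', 'Win_Prob'])
        writer.writerows(rows)
    return len(rows)

# Placeholder data written to a temporary directory
out_path = os.path.join(tempfile.mkdtemp(), 'Win_Probability.csv')
rows_written = save_table(out_path, ['Team A', 'Team B'], ['55.0%', '45.0%'])
```

Opening the file in a with block also guarantees it is flushed and closed, which the bare open() call in the question does not.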