Python and Web Scraping - CSV Output Issue

Date: 2018-12-21 14:35:42

Tags: python web web-scraping screen-scraping

I'm currently trying to run a Python script to pull some data from the Yahoo Fantasy Football site. I've been able to scrape the data successfully, but I'm having trouble with the CSV output: all of the data ends up in a single column instead of being split across multiple columns. Here is the code I'm using:

import re, time, csv
import requests
from bs4 import BeautifulSoup

#Variables
League_ID = 1459285
Week_Number = 1
Start_Week = 1
End_Week = 13
Team_Name = "Test"
Outfile = 'Team_Stats.csv'
Fields = ['Player Name', 'Player Points', 'Player Team', 'Week']


with open('Team_Stats.csv', 'w') as Team_Stats:
    f = csv.writer(Team_Stats, Fields, delimiter=',', lineterminator='\n')
    f.writerow(Fields)

    for Week_Number in range(Start_Week, End_Week + 1):
        url = requests.get("https://football.fantasysports.yahoo.com/f1/" + str(League_ID) + "/2/team?&week=" + str(Week_Number))
        soup = BeautifulSoup(url.text, "html.parser")
        #print("Player Stats for " + Team_Name + " for Week " + str(Week_Number))

        player_name = soup.find_all('div', {'class': 'ysf-player-name'})
        player_points = soup.find_all('a', {'class': 'pps Fw-b has-stat-note '})

        for player_name in player_name:
            player_name = player_name.contents[0]
            #print(div.text)
            f.writerow(player_name)

        for player_points in player_points:
            #print(div.text)
            Week_Number += 1
            f.writerow(player_points)

    Team_Stats.flush()
    Team_Stats.close()
    print("Process Complete")

I'd also like to leave room in the code to add more "for" loops, since I have other data I want to collect.

If anyone can suggest a better way to structure my code, please feel free to help!

Here is a sample of the output I'm getting in the csv file:

[screenshot of CSV output]

Thanks

1 Answer:

Answer 0 (score: 0)


1) The class used to scrape the player names is wrong.

2) I used zip() to iterate over both lists at once, building each row with the name and the points:

    import re, time, csv
    import requests
    from bs4 import BeautifulSoup

    #Variables
    League_ID = 1459285
    Week_Number = 1
    Start_Week = 1
    End_Week = 13
    Team_Name = "Test"
    Outfile = 'Team_Stats.csv'
    Fields = ['Player Name', 'Player Points', 'Player Team', 'Week']

    with open('Team_Stats.csv', 'w') as Team_Stats:
        f = csv.writer(Team_Stats, Fields, delimiter=',', lineterminator='\n')
        f.writerow(Fields)

        for Week_Number in range(Start_Week, End_Week + 1):
            row = []
            url = requests.get("https://football.fantasysports.yahoo.com/f1/" + str(League_ID) + "/2/team?&week=" + str(Week_Number))
            soup = BeautifulSoup(url.text, "html.parser")
            #print("Player Stats for " + Team_Name + " for Week " + str(Week_Number))

            player_name = soup.find_all('a', {'class': 'Nowrap name F-link'})
            player_points = soup.find_all('a', {'class': 'pps Fw-b has-stat-note '})

            for pn, pp in zip(player_name, player_points):
                player_name = pn.contents[0]
                player_points = pp.contents[0]
                f.writerow([player_name, player_points])

        Team_Stats.flush()
        Team_Stats.close()

    print("Process Complete")
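As a side note, a minimal sketch (using only Python's standard csv module, independent of the Yahoo page) shows why the original rows came out wrong: csv.writer.writerow() iterates whatever it is given, so passing a bare string splits it into one cell per character, while passing a list produces one cell per element.

```python
import csv
import io

# writerow() iterates its argument: a string yields one cell per
# character, while a list yields one cell per element.
buf = io.StringIO()
writer = csv.writer(buf, lineterminator='\n')

writer.writerow("Tom")                  # string -> split into characters
writer.writerow(["Tom Brady", "28.4"])  # list -> one column per field

print(buf.getvalue())
# T,o,m
# Tom Brady,28.4
```

This is why the corrected code above wraps the two values in a list ([player_name, player_points]) before calling writerow.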