Not getting the time values when web scraping with selenium

Time: 2021-03-08 22:53:59

Tags: python web-scraping selenium-chromedriver

I have this code to scrape the oddsportal page:

https://www.oddsportal.com/soccer/england/premier-league/

import pandas as pd
from selenium import webdriver

browser = webdriver.Chrome()
browser.get("https://www.oddsportal.com/soccer/england/premier-league/")

df= pd.read_html(browser.page_source, header=0)[0]

timeList = []
dateList = []
gameList = []
home_odds = []
draw_odds = []
away_odds = []

for row in df.itertuples():
    if not isinstance(row[1], str):
        continue
    elif ':' not in row[1]:
        date = row[1].split('-')[0]
        continue
    time = timeList.append(row[1])
    dateList.append(date)
    gameList.append(row[2])
    home_odds.append(row[4])
    draw_odds.append(row[5])
    away_odds.append(row[6])

result = pd.DataFrame({'date':dateList,
                       'time':time,
                       'game':gameList,
                       'Home':home_odds,
                       'Draw':draw_odds,
                       'Away':away_odds})

The output I get is:

    date           time    game                             Home    Draw    Away
--  -------------  ------  -----------------------------  ------  ------  ------
 0  Today, 08 Mar          Chelsea - Everton                1.62    3.93    6.07
 1  Today, 08 Mar          West Ham - Leeds                 2.25    3.61    3.18
 2  10 Mar 2021            Manchester City - Southampton    1.22    6.94   13.75
 3  12 Mar 2021            Newcastle - Aston Villa          3.8     3.59    2
 4  13 Mar 2021            Leeds - Chelsea                  4.45    3.97    1.77
 5  13 Mar 2021            Crystal Palace - West Brom       2.1     3.34    3.77
 6  13 Mar 2021            Everton - Burnley                1.84    3.61    4.54
 7  13 Mar 2021            Fulham - Manchester City        10.05    5.16    1.34
 8  14 Mar 2021            Southampton - Brighton           2.8     3.11    2.77
 9  14 Mar 2021            Leicester - Sheffield Utd        1.5     4.34    7.06
10  14 Mar 2021            Arsenal - Tottenham              2.48    3.47    2.87
11  14 Mar 2021            Manchester Utd - West Ham        1.86    3.62    4.44
12  15 Mar 2021            Wolves - Liverpool               4.65    3.66    1.8
13  19 Mar 2021            Fulham - Leeds                   2.55    3.53    2.72
14  20 Mar 2021            Brighton - Newcastle             1.76    3.39    5.58
15  21 Mar 2021            West Ham - Arsenal               2.86    3.51    2.44
16  21 Mar 2021            Aston Villa - Tottenham          3.24    3.4     2.27

I am not getting any values in the time column.


Can someone help me understand whether I am missing something? Am I defining time correctly?

2 Answers:

Answer 0 (score: 1)

timeList.append(row[1]) does not return anything, so time is always None. I suspect you want:

    time = row[1]
    timeList.append(time)
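To see the behaviour in isolation, here is a minimal sketch. The names time_list and returned, and the value "20:00", are made up for illustration and are not from the question:

```python
# list.append mutates the list in place and returns None,
# so assigning its result to a variable only ever stores None.
time_list = []
returned = time_list.append("20:00")

print(returned)   # the return value of append
print(time_list)  # the list itself now holds the appended value
```

The value ends up in the list, not in the variable on the left-hand side of the assignment.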

Answer 1 (score: 1)

I think this is a simple oversight. You have assigned the return value of list.append(), which is None, to the time variable.

So, instead of:

time = timeList.append(row[1])

just call the function, as you already do for the other lists:

timeList.append(row[1])
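One more thing that probably needs changing: the question builds the result with 'time': time, so the DataFrame most likely has to reference the list (timeList) instead. Below is a rough sketch of the whole loop with both changes applied. It does not scrape anything: the small df is a hypothetical stand-in for pd.read_html(browser.page_source, header=0)[0], the column names and kick-off times are invented, and only the odds are copied from the output in the question.

```python
import pandas as pd

# Hypothetical stand-in for the scraped table (column names and times are made up).
# Date rows have no ':' in the first column; match rows start with a kick-off time.
df = pd.DataFrame(
    [
        ["Today, 08 Mar", None, None, "1", "X", "2"],
        ["18:00", "Chelsea - Everton", None, 1.62, 3.93, 6.07],
        ["20:00", "West Ham - Leeds", None, 2.25, 3.61, 3.18],
        [float("nan"), None, None, None, None, None],
        ["10 Mar 2021", None, None, "1", "X", "2"],
        ["18:00", "Manchester City - Southampton", None, 1.22, 6.94, 13.75],
    ],
    columns=["when", "game", "score", "home", "draw", "away"],
)

time_list, date_list, game_list = [], [], []
home_odds, draw_odds, away_odds = [], [], []
date = None

for row in df.itertuples():
    first = row[1]  # row[0] is the index, row[1] the first column
    if not isinstance(first, str):
        continue  # skip NaN separator rows
    if ":" not in first:
        date = first.split("-")[0].strip()  # remember the current date header
        continue
    time_list.append(first)  # append in place; do not assign the result
    date_list.append(date)
    game_list.append(row[2])
    home_odds.append(row[4])
    draw_odds.append(row[5])
    away_odds.append(row[6])

result = pd.DataFrame({
    "date": date_list,
    "time": time_list,  # the list, not a single variable
    "game": game_list,
    "Home": home_odds,
    "Draw": draw_odds,
    "Away": away_odds,
})
print(result)
```

With the real page the table layout may differ, so the column positions (row[2], row[4] and so on) are taken from the question as-is and should be checked against the actual scraped DataFrame.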