Trying to scroll and scrape a dynamically loaded webpage

Time: 2018-10-07 00:46:56

Tags: python python-3.x selenium selenium-webdriver beautifulsoup

I am trying to scrape all of the available odds for every game found on this page: https://www.sportsbookreview.com/betting-odds/nfl-football/?date=20170917

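I know the page is loaded dynamically, so I tried adding a scroll step in the hope that all of the available odds would load as the page scrolled. Unfortunately that does not seem to be the case: as the page keeps scrolling, it simply removes the data that was loaded earlier.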

I have tried implementing the approaches from similar posts (e.g. Trying to use Python and Selenium to scroll and scrape a webpage iteratively), but I still cannot get it to work. My code is pasted below.

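from selenium import webdriver
from selenium.webdriver.common.by import By

url = 'https://www.sportsbookreview.com/betting-odds/nfl-football/?date=20170917'
driver = webdriver.Chrome()
driver.get(url)
# Scroll down once in the hope of triggering the dynamic loading.
driver.execute_script("window.scrollTo(0, 900)")

# Grab every odds cell currently present in the DOM.
# (find_elements_by_class_name was removed in Selenium 4; find_elements with By replaces it.)
odds_finder = driver.find_elements(By.CLASS_NAME, '_3h0tU')

# Keep only the visible text of each cell.
file_odds = []
for x in odds_finder:
    file_odds.append(x.text)

driver.quit()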

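The output of file_odds is pasted below. As you can see, for the first several games the list contains only the "opener" and "wagers" entries rather than the rest of the available odds, which are scraped only for the games later in the list. Any help would be appreciated.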

['64%',
 '36%',
 'PK-110',
 'PK-110',
 '56%',
 '44%',
 '+7½-110',
 '-9+105',
 '58%',
 '42%',
 '+7-105',
 '-7-115',
 '66%',
 '34%',
 '-4½-110',
 '+4½-110',
 '49%',
 '51%',
 '-7-110',
 '+7-110',
 '45%',
 '55%',
 '+4½-110',
 '-4½-110',
 '49%',
 '51%',
 '+7½-130',
 '-7½+110',
 '+8½-104',
 '-8½-106',
 '+8½-105',
 '-8½-105',
 '+8-110',
 '-8-110',
 '+8½-110',
 '-8½-110',
 '+9-110',
 '-9-110',
 '+8½-105',
 '-8½-105',
 '53%',
 '47%',
 '+6-110',
 '-6-110',
 '+7-100',
 '-7-110',
 '+7-105',
 '-7-105',
 '+7-119',
 '-7-101',
 '+7-110',
 '-7-110',
 '+7-110',
 '-7-110',
 '+6½+105',
 '-6½-115',
 '49%',
 '51%',
 '+4-110',
 '-4-110',
 '+3½-105',
 '-3½-105',
 '+3½-105',
 '-3½-105',
 '+3½-110',
 '-3½-110',
 '+3½-110',
 '-3½-110',
 '+3½-110',
 '-3½-110',
 '+3½-110',
 '-3½+100',
 '37%',
 '63%',
 '+14½-120',
 '-14½+100',
 '+14-100',
 '-14-110',
 '+14-105',
 '-14-105',
 '+14-114',
 '-14-106',
 '+14-110',
 '-14-110',
 '+14+100',
 '-14-120',
 '+13½+105',
 '-13½-115',
 '53%',
 '47%',
 '+3-120',
 '-3+100',
 '+3-106',
 '-3-104',
 '+3-110',
 '-3+100',
 '+3-112',
 '-3-108',
 '+3-110',
 '-3-110',
 '+3-105',
 '-3-115',
 '+3-105',
 '-3-105',
 '60%',
 '40%',
 '-1-120',
 '+1+100',
 '-2½-100',
 '+2½-110',
 '-2½-103',
 '+2½-107',
 '-2½-105',
 '+2½-115',
 '-2½-118',
 '+2½-102',
 '-3-105',
 '+3-115',
 '-2½-105',
 '+2½-105',
 '41%',
 '59%',
 '+14-130',
 '-14+110',
 '+13½-110',
 '-13½-100',
 '+13½-108',
 '-13½-102',
 '+13½-115',
 '-13½-105',
 '+13½-110',
 '-13½-110',
 '+14-105',
 '-14-115',
 '+13½-105',
 '-13½-105',
 '51%',
 '49%',
 '+2½+100',
 '-2½-120',
 '+3½-110',
 '-3½-100',
 '+3+108',
 '-3-118',
 '+3+105',
 '-3-125',
 '+3+110',
 '-3-130',
 '+3-105',
 '-3-115',
 '+3+110',
 '-3-120']

1 answer:

Answer 0 (score: 0)

Try:

//div[contains(@class,'_3A-gC')]//section//div[starts-with(@class,'_3h0tU')]
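
Because the question reports that the page drops earlier rows as it scrolls, a single lookup after one scroll will probably still miss cells, whatever locator is used. One possible way to combine the XPath above with scrolling is to move down in small steps and record what is visible at each stop. The following is only a sketch under several assumptions that are not in the original post: the helper name collect_while_scrolling and its parameters are hypothetical, the page is assumed to scroll the main window rather than an inner container, and each cell is assumed to keep the same page position across scrolls so that its position can serve as a de-duplication key (the odds text itself repeats, so text alone cannot).

```python
import time

# XPath from the answer above. The class names are generated by the site
# and may well have changed since the question was asked.
ODDS_XPATH = (
    "//div[contains(@class,'_3A-gC')]//section"
    "//div[starts-with(@class,'_3h0tU')]"
)


def collect_while_scrolling(driver, xpath=ODDS_XPATH, step=600, pause=1.0,
                            max_steps=100):
    """Scroll down in small steps, remembering every cell seen along the way.

    Hypothetical helper: `driver` is a Selenium WebDriver (or any object with
    the same find_elements / execute_script methods).
    """
    seen = {}  # (page_y, page_x) -> text, so re-rendered cells are not duplicated
    for _ in range(max_steps):
        # "xpath" is the string value of selenium's By.XPATH.
        for el in driver.find_elements("xpath", xpath):
            loc = el.location  # page coordinates: {'x': ..., 'y': ...}
            key = (round(loc["y"]), round(loc["x"]))
            text = el.text
            if text:
                seen.setdefault(key, text)
        # Stop when the window has reached the bottom of the document
        # (2px tolerance for fractional scroll positions).
        at_bottom = driver.execute_script(
            "return window.innerHeight + window.scrollY"
            " >= document.documentElement.scrollHeight - 2;"
        )
        if at_bottom:
            break
        driver.execute_script("window.scrollBy(0, arguments[0]);", step)
        time.sleep(pause)  # crude wait for the newly revealed rows to render
    # Return the texts in reading order: top to bottom, then left to right.
    return [seen[key] for key in sorted(seen)]
```

With the question's setup this could be called as file_odds = collect_while_scrolling(driver) after driver.get(url) and before driver.quit(). The step should stay smaller than the window height so that no row is skipped between stops. If the page re-renders a cell between the lookup and the read of its text, Selenium can raise StaleElementReferenceException, which this sketch does not handle.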