Python web scraping with Selenium: "onclick" download

Date: 2018-11-08 20:38:30

Tags: python selenium web-scraping

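I am trying to write a script that uses Selenium to download files containing information on different NHL players. I want to download the files for a set of different dates. The URL ends with the date, for example: https://www.fantasycruncher.com/lineup-rewind/draftkings/NHL/2018-10-29

There is also a drop-down menu for selecting the number of rows shown per page, so I created a loop that iterates over the set of dates and displays all rows on one page.

Finally, there is a drop-down menu called "Actions", and one of its options is "Download Player List". I want to click that option inside the loop, which downloads a CSV file.

Here is my current code: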

from selenium import webdriver
from selenium.webdriver.support.ui import Select
from datetime import date, timedelta

chromedriver = "C:/Users/Michel/Desktop/python/package/chromedriver_win32/chromedriver.exe"
driver = webdriver.Chrome(chromedriver)


DFS = []
calendar= []
calendar.append("2018-10-30")
calendar.append("2018-10-31")
for d in calendar:
    driver.get("https://www.fantasycruncher.com/lineup-rewind/draftkings/NHL/"+ d)
    select = Select(driver.find_element_by_name('ff_length'))
    select.select_by_value("-1")
driver.close()

I am trying to trigger a click after selecting "-1". Here is the source code of the "Download Player List" option:

 <div class="table-actions-option" data-action="downloadPlayerlist" onclick="return true;">Download Player List</div> 

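How can I trigger the click to download the list?

After that, I plan to access the downloaded files in C:\Users\Downloads. Is that possible as is, or do I need to add some lines of code?

Thanks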

1 Answer:

Answer 0 (score: 0)

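The Download Player List element only appears once the Actions menu has been opened, and it shows up after a delay, so you need to wait for it before clicking.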

from selenium import webdriver
from selenium.webdriver.support.ui import Select
from datetime import date, timedelta

from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

chromedriver = "C:/Users/Michel/Desktop/python/package/chromedriver_win32/chromedriver.exe"
options = webdriver.ChromeOptions()
# download.default_directory is a Chrome preference, not a command-line switch,
# so it is set through the "prefs" experimental option rather than add_argument.
prefs = {'download.default_directory': r'C:\Users\Downloads'}
options.add_experimental_option('prefs', prefs)

driver = webdriver.Chrome(chromedriver, options=options)
wait = WebDriverWait(driver, 5)

DFS = []
calendar = []
calendar.append("2018-10-30")
calendar.append("2018-10-31")
for d in calendar:
    driver.get("https://www.fantasycruncher.com/lineup-rewind/draftkings/NHL/" + d)
    # Close the login alert.
    closeButton = driver.find_element_by_class_name('close-login-alert')
    closeButton.click()
    # Show all rows on a single page.
    select = Select(driver.find_element_by_name('ff_length'))
    select.select_by_value("-1")
    # Open the "Actions" drop-down so that its options are rendered.
    actions = driver.find_element_by_id('table-actions')
    actions.click()
    # Wait until the option is visible and enabled, then click it.
    downloadPlayerlist = wait.until(EC.element_to_be_clickable((By.XPATH, '//div[@data-action="downloadPlayerlist"]')))
    downloadPlayerlist.click()

# remove the comment below to close the browser
#driver.close()
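
As for accessing the downloaded file afterwards: the click returns before the download has finished, so the script probably needs to wait for the file to appear. Below is a minimal sketch that uses only the standard library. The helper name `wait_for_new_file` and its parameters are made up for illustration. It relies on Chrome giving in-progress downloads a `.crdownload` suffix, and it assumes that the newest new `.csv` file in the folder is the one that was just requested.

```python
import time
from pathlib import Path


def wait_for_new_file(directory, before, pattern="*.csv", timeout=30.0, poll=0.5):
    """Return the newest file matching `pattern` that is not in `before`.

    `before` is a snapshot (set of Path objects) taken before the click.
    The download counts as finished when a new matching file exists and no
    Chrome partial download (*.crdownload) is left in `directory`.
    Raises TimeoutError if that does not happen within `timeout` seconds.
    """
    directory = Path(directory)
    deadline = time.monotonic() + timeout
    while True:
        new_files = set(directory.glob(pattern)) - set(before)
        still_downloading = any(directory.glob("*.crdownload"))
        if new_files and not still_downloading:
            # Assumption: the most recently modified new file is the download.
            return max(new_files, key=lambda p: p.stat().st_mtime)
        if time.monotonic() >= deadline:
            raise TimeoutError(
                "no completed download matching %r in %s" % (pattern, directory)
            )
        time.sleep(poll)
```

Inside the loop, you would take the snapshot just before the click, for example `before = set(Path(r'C:\Users\Downloads').glob('*.csv'))`, and call `wait_for_new_file(r'C:\Users\Downloads', before)` right after `downloadPlayerlist.click()`. The returned path can then be opened or renamed, for instance to include the date `d`.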