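I need to get some data from a URL, but the only way to do so is by downloading it. The code below works in some cases for this particular site, but sometimes it opens the browser, navigates to the site, and then nothing happens. I have tried WebDriverWait in various ways, but it doesn't seem to make a difference. I'm hoping someone can help me figure out what's wrong, because I'm lost.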
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException
bs_url = "https://baseballsavant.mlb.com/statcast_search?hfPT=&hfAB=&hfBBT=&hfPR=&hfZ=&stadium=&hfBBL=&hfNewZones=&hfGT=R%7C&hfC=&hfSea=2016%7C&hfSit=&player_type=batter&hfOuts=&opponent=&pitcher_throws=L&batter_stands=&hfSA=&game_date_gt=&game_date_lt=&team=&position=&hfRO=&home_road=&hfFlag=&metric_1=&hfInn=&min_pitches=0&min_results=0&group_by=name&sort_col=pitches&player_event_sort=h_launch_speed&sort_order=desc&min_abs=0#results"
driver = webdriver.Chrome()
driver.wait = WebDriverWait(driver, 5)
driver.get(bs_url)
driver.wait = WebDriverWait(driver, 5)
Stats = driver.find_element_by_id("table_all_pid_").click()
driver.wait = WebDriverWait(driver, 5)
driver.quit()
HTML:
>% of Pitches</th>
<th colspan="1"></th>
<th title="Create Chart Comparison" class="table-icon visual" id="compare_all_pid_"><img src="site-core/images/chart_curve.png" /></th>
<th title="Download Results Comma Separated Values File" class="table-icon csv_table" id="table_all_pid_"><img src="site-core/images/disk.png" /></th>
<th title="Download Data as Comma Separated Values File" class="table-icon csv" id="csv_all_pid_"><img src="site-core/images/database_link.png" /></th>
</tr>
</thead>
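Answer 0 (score: 0)
To click the element whose title is "Download Results Comma Separated Values File", you can use either of the following locators:
css_selector: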
driver.find_element_by_css_selector("th.table-icon.csv_table#table_all_pid_[title='Download Results Comma Separated Values File'] > img").click()
xpath:
driver.find_element_by_xpath("//th[@class='table-icon csv_table' and @id='table_all_pid_' and @title='Download Results Comma Separated Values File']/img").click()
Answer 1 (score: -1)
You can try this code:
bs_url = "https://baseballsavant.mlb.com/statcast_search?hfPT=&hfAB=&hfBBT=&hfPR=&hfZ=&stadium=&hfBBL=&hfNewZones=&hfGT=R%7C&hfC=&hfSea=2016%7C&hfSit=&player_type=batter&hfOuts=&opponent=&pitcher_throws=L&batter_stands=&hfSA=&game_date_gt=&game_date_lt=&team=&position=&hfRO=&home_road=&hfFlag=&metric_1=&hfInn=&min_pitches=0&min_results=0&group_by=name&sort_col=pitches&player_event_sort=h_launch_speed&sort_order=desc&min_abs=0#results"
driver = webdriver.Chrome()
driver.wait = WebDriverWait(driver, 50)
driver.get(bs_url)
WebDriverWait(driver,20).until(EC.presence_of_element_located((By.ID,"table_all_pid_")))
WebDriverWait(driver,20).until(EC.element_to_be_clickable((By.ID,"table_all_pid_")))
driver.find_element_by_id("table_all_pid_").click()  # click() returns None, so there is nothing to assign
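You can then assert that the download succeeded by checking whether the file appears in the download directory.
Try this code and let me know how it goes.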
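The download-directory check mentioned above could look roughly like the sketch below. It uses only the standard library. The helper name `wait_for_download`, its parameters, and the `download_dir` variable in the usage comment are hypothetical names introduced here, not part of the answer. It also assumes the finished file ends in `.csv` and that an in-progress download carries a different temporary name (Chrome typically uses a `.crdownload` suffix).

```python
import os
import time


def wait_for_download(directory, before, timeout=30.0, poll_interval=0.5, suffix=".csv"):
    """Poll `directory` until a new file ending in `suffix` appears.

    `before` is the set of file names that existed before the click, so
    older downloads are not mistaken for the new one. Returns the full
    path of the new file, or None if nothing shows up within `timeout`.
    """
    deadline = time.monotonic() + timeout
    while True:
        new_names = set(os.listdir(directory)) - before
        # Ignore temporary names of unfinished downloads (assumption:
        # they do not end in `suffix`).
        finished = sorted(name for name in new_names if name.endswith(suffix))
        if finished:
            return os.path.join(directory, finished[0])
        if time.monotonic() >= deadline:
            return None
        time.sleep(poll_interval)


# Hypothetical usage with the driver from the answer:
#   before = set(os.listdir(download_dir))   # download_dir: your browser's download folder
#   driver.find_element_by_id("table_all_pid_").click()
#   path = wait_for_download(download_dir, before)
#   assert path is not None, "the CSV did not arrive in time"
```

Taking the snapshot before the click matters: without it, a CSV left over from an earlier run would make the check pass even if the current click did nothing.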