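I am using Selenium to go to a URL and click search. This part works fine. After the search, I want to scrape all the href URLs on the page that point to athletes, then click through to the next page. I have tried multiple classes and XPath locations, but without success...
Goal: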
1) Go to the URL listed
2) Click the search button
3) Scrape all the URLs that go to each athlete's profile page
4) Click the next page button at the bottom
5) Repeat this process through all the pages
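To make the intended flow concrete, steps 3 to 5 amount to a simple loop. The sketch below is only an outline of that loop, not working Selenium code: `collect_links`, `get_links`, and `go_next` are hypothetical names, and the two callables stand in for whatever locator logic ends up working on the page.

```python
def collect_links(get_links, go_next, max_pages=1000):
    """Gather links page by page until go_next() reports no further page.

    get_links: callable returning the list of hrefs on the current page.
    go_next:   callable that moves to the next page and returns True,
               or returns False when there is no next page.
    max_pages: safety cap so a broken "next" check cannot loop forever.
    """
    links = []
    for _ in range(max_pages):
        links.extend(get_links())
        if not go_next():
            break
    return links
```

With Selenium, `get_links` would wrap the `find_elements_by_xpath` / `get_attribute("href")` calls, and `go_next` would click the next-page button and return False once that button is missing or disabled.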
from selenium import webdriver
from selenium.common.exceptions import TimeoutException
TIMEOUT = 5
driver = webdriver.Firefox()
driver.set_page_load_timeout(TIMEOUT)
url = 'https://n.rivals.com/search#?formValues=%7B%22sport%22:%22Football%22,%22recruit_year%22:2021,%22offer_and_visit_type%22:%5B%22Offer%22%5D,%22prospect_profiles.prospect_colleges.offer%22:true,%22page_number%22:1,%22page_size%22:50%7D'
try:
    driver.get(url)
except TimeoutException:
    pass
#this click method works
search_button = driver.find_element_by_xpath('//*[@id="articles"]/div/div[2]/div/div/div[1]/form/div[2]/div[5]/button')
search_button.click()
#I cannot find/get the href links below to print:
profile_page = driver.find_elements_by_xpath('//*[@id="content_"]/td[1]/div[2]/div/a')
profile_page = [home.get_attribute("href") for home in profile_page]
print(profile_page)
#I cannot get it to click the next button to do the same thing on the next page:
next_button = driver.find_element_by_xpath('//*[@id="content_"]/td[1]/div[2]/div/a')
next_button.click()
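One thing I noticed is that the URL itself carries `page_number` inside the `formValues` JSON, so stepping through pages by rewriting the URL might be an alternative to clicking the next button. The sketch below only builds such URLs; `with_page_number` is a hypothetical helper, and I have not verified that the site re-runs the search when only the part after `#` changes.

```python
import json
from urllib.parse import quote, unquote

MARKER = 'formValues='

def with_page_number(url, page_number):
    """Return a copy of url whose formValues JSON has the given page_number."""
    prefix, marker, encoded = url.partition(MARKER)
    if not marker:
        raise ValueError('URL has no formValues parameter')
    # Decode the percent-encoded JSON, change one field, and re-encode it.
    form_values = json.loads(unquote(encoded))
    form_values['page_number'] = page_number
    compact = json.dumps(form_values, separators=(',', ':'))
    return prefix + marker + quote(compact)
```

For example, `with_page_number(url, 2)` could be passed to `driver.get(...)` in place of the next-button click, assuming the site honours the changed value.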