Unable to click the next-page button while web scraping (Selenium)

Date: 2021-07-30 21:31:10

Tags: selenium selenium-webdriver web-scraping

I am trying to scrape this page: https://www.bumeran.com.pe/empleos-publicacion-menor-a-7-dias.html. However, I cannot click the next-page button to go through all the pages. This is what I tried:

from selenium import webdriver
from webdriver_manager.chrome import ChromeDriverManager
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait

options = webdriver.ChromeOptions()
options.add_argument('--no-sandbox')
options.add_argument('--disable-dev-shm-usage')

PATH = r"C:\Program Files (x86)\chromedriver.exe"  # raw string so the backslashes are not treated as escapes
wd = webdriver.Chrome(PATH, options=options)
wd.maximize_window()

wd.get("https://www.bumeran.com.pe/empleos-publicacion-menor-a-7-dias.html")

def pag_sig():
    element_present = EC.visibility_of_element_located((By.XPATH, '//*[@class="Pagination__NextPage-sc-el3mid-4 gSlsBf"]/i'))
    boton = WebDriverWait(wd,5).until(element_present)
    boton.click()

page_max=127 #I get this with another piece of code
for i in range(page_max):
    pag_sig()

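However, this does not work reliably. Sometimes my scraper clicks other links instead of the button itself. I tried adding waits, but that did not help. How can I solve this?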

2 answers:

Answer 0: (score: 1)

You can click the button itself rather than the i element nested inside it. I hope this helps:

# `driver` is the WebDriver instance (`wd` in the question)
button = WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//button[@class='Pagination__NextPage-sc-el3mid-4 gSlsBf']")))
button.click()
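
As a rough sketch of how this wait could be folded into the pagination loop, the snippet below locates the button again on every page and stops once it no longer appears, rather than relying only on a fixed page count. The names click_next_until_done and make_selenium_finder are hypothetical helpers, not Selenium APIs. The XPath is the one from this answer, and its generated class name may change on the site.

```python
import time


def click_next_until_done(find_next_button, max_pages, pause=0.0):
    """Click the next-page button up to max_pages times.

    find_next_button is a caller-supplied callable (hypothetical) that
    returns a clickable element, or None when no button is available.
    Returns the number of clicks performed.
    """
    clicks = 0
    for _ in range(max_pages):
        button = find_next_button()  # look the button up again on every page
        if button is None:
            break  # no next button found: assume the last page was reached
        button.click()
        clicks += 1
        if pause:
            time.sleep(pause)
    return clicks


def make_selenium_finder(driver, timeout=20):
    """Build a finder that waits until the button is clickable.

    Selenium is imported lazily so the loop above stays usable without it.
    """
    from selenium.common.exceptions import TimeoutException
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    # XPath copied from the answer; the class name is auto-generated (assumption: still valid).
    xpath = "//button[@class='Pagination__NextPage-sc-el3mid-4 gSlsBf']"

    def find():
        try:
            return WebDriverWait(driver, timeout).until(
                EC.element_to_be_clickable((By.XPATH, xpath))
            )
        except TimeoutException:
            return None  # treated as "no more pages" by the loop

    return find
```

With a real browser this might be called as click_next_until_done(make_selenium_finder(driver), 127). Keeping the loop independent of Selenium is only a design choice, so that it can be exercised with a stub.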

Answer 1: (score: 1)

It may be because the page has not been scrolled down. Try the XPath and code below.

#xpath - //div[@id='listado-avisos']//button[@class='Pagination__NextPage-sc-el3mid-4 gSlsBf']

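import time
from selenium.webdriver.common.by import By

driver.get("https://www.bumeran.com.pe/empleos-publicacion-menor-a-7-dias.html")
for i in range(5):
    # Locate the button again on every iteration
    nextoption = driver.find_element(By.XPATH, "//div[@id='listado-avisos']//button[@class='Pagination__NextPage-sc-el3mid-4 gSlsBf']")
    # Align the button with the top of the viewport, then scroll back up 300 px
    driver.execute_script("arguments[0].scrollIntoView(true);", nextoption)
    driver.execute_script("window.scrollBy(0,-300)")
    nextoption.click()
    time.sleep(3)  # give the next page time to load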
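
A possible variation on the scrolling idea, offered only as a sketch: instead of scrolling the button to the top and then back up by a fixed 300 px, the element can be centered in the viewport with the DOM scrollIntoView options, and optionally clicked through JavaScript if something still overlaps it. The helper name scroll_and_click and its use_js_click flag are hypothetical. Only driver.execute_script and element.click are Selenium calls.

```python
def scroll_and_click(driver, element, use_js_click=False):
    """Center the element in the viewport, then click it.

    driver: a Selenium WebDriver (anything exposing execute_script).
    element: the WebElement to click.
    use_js_click: if True, dispatch the click from JavaScript, which is
    not blocked by an overlapping element the way a native click can be.
    """
    # Put the element in the middle of the viewport rather than at its top edge.
    driver.execute_script(
        "arguments[0].scrollIntoView({block: 'center'});", element
    )
    if use_js_click:
        driver.execute_script("arguments[0].click();", element)
    else:
        element.click()
```

Inside the loop above, nextoption.click() and the two execute_script lines could then be replaced by scroll_and_click(driver, nextoption). Whether the JavaScript click is needed depends on the page layout, so it is off by default here.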