Selenium cannot click the Next button

Date: 2021-06-10 23:28:00

Tags: python selenium selenium-webdriver web-scraping xpath

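I have been working on this for hours without making any progress. I am trying to click the Next button on the page whose URL appears in the code below.

Here is my code: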

#!/usr/local/bin python3

import sys
import time
import re
import logging
from selenium import webdriver
from selenium.webdriver.firefox.options import Options as options
from bs4 import BeautifulSoup as bs
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.common.action_chains import ActionChains


_USE_VIRTUAL_DISPLAY = False
_FORMAT = '%(asctime)s - %(levelname)s - %(name)s - %(message)s'
# logging.basicConfig(filename=LOG_FILENAME,level=logging.DEBUG)
logging.basicConfig(format=_FORMAT, level=logging.INFO)
_LOGGER = logging.getLogger(sys.argv[0])
_DEFAULT_SLEEP = 0.5


try:
    options = options()
    # options.headless = True
    

    driver = webdriver.Firefox(options=options, executable_path=r"/usr/local/bin/geckodriver")
    
    print("Started Browser and Driver")


except:
    _LOGGER.info("Can not run headless mode.")

url = 'https://www.govinfo.gov/app/collection/uscourts/district/alsd/2021/%7B%22pageSize%22%3A%22100%22%2C%22offset%22%3A%220%22%7D'

driver.get(url)
time.sleep(5)

page = driver.page_source
soup = bs(page, "html.parser")


next_page = WebDriverWait(driver,5).until(EC.element_to_be_clickable((By.XPATH,'//*[@id="collapseOne1690"]/div/span[1]/div/ul/li[8]/a')))
if next_page:
    print('*****getting next page*****')
    # driver.execute_script('arguments[0].click()', next_page)
    next_page.click()
    time.sleep(3)
    
else:
    print('no next page')
    

driver.quit()

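I get a timeout error. I have tried changing the XPath, and I have tried using ActionChains to scroll the element into view, but nothing works. Any help is appreciated.

2 answers:

Answer 0: (score: 2)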

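1. Your XPath does not work because it relies on the dynamically generated id collapseOne1690, as the other answer points out. Even matching only part of that id would not be very stable. If you prefer XPath, I suggest //span[@class='custom-paginator']//li[@class='next fw-pagination-btn']/a, or simply //li[@class='next fw-pagination-btn']/a. You can also use a CSS selector: .next.fw-pagination-btn (append " a" to target the link inside the list item).

2. I removed the logging code because it also has some problems; recheck it.

3. An explicit wait of 5 seconds is too short. Use at least 10 seconds, preferably 15. This is only a suggestion.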
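
On point 3, a longer timeout costs nothing when the button shows up quickly, because an explicit wait returns as soon as its condition holds. The sketch below is a simplified model of that polling behaviour, not Selenium's own code; `wait_until`, `WaitTimeout`, and `find_button` are hypothetical names, and the 0.3-second delay only simulates a slow page.

```python
import time


class WaitTimeout(Exception):
    """Raised when the condition never becomes truthy (hypothetical name)."""


def wait_until(condition, timeout=15.0, poll_interval=0.5):
    """Call condition() repeatedly until it is truthy or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            # Return immediately: the full timeout is only spent on failure.
            return result
        if time.monotonic() >= deadline:
            raise WaitTimeout(f"condition not met within {timeout} s")
        time.sleep(poll_interval)


# Simulated lookup: the "button" only exists 0.3 s after the page starts loading.
start = time.monotonic()


def find_button():
    return "next-button" if time.monotonic() - start >= 0.3 else None


found = wait_until(find_button, timeout=2.0, poll_interval=0.1)
print(found)
```

With Selenium itself, the same role is played by `WebDriverWait(driver, 15).until(...)`, which is what the code in this answer uses.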

Minimal reproducible code that clicks the button using Firefox:

from selenium import webdriver
from selenium.webdriver.firefox.options import Options
from bs4 import BeautifulSoup as bs
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By

# Keep the class name distinct from the instance so the import is not shadowed
options = Options()
# options.headless = True

driver = webdriver.Firefox(options=options)

print("Started Browser and Driver")

url = 'https://www.govinfo.gov/app/collection/uscourts/district/alsd/2021/%7B%22pageSize%22%3A%22100%22%2C%22offset%22%3A%220%22%7D'

driver.get(url)

page = driver.page_source
soup = bs(page, "html.parser")
print(soup)

next_page = WebDriverWait(driver, 15).until(
    EC.element_to_be_clickable((By.XPATH, "//span[@class='custom-paginator']//li[@class='next fw-pagination-btn']/a")))
next_page.click()

# driver.quit()

Answer 1: (score: 0)

When I load this page, the div id is assigned dynamically. The first time I loaded the page the id was collapseOne5168; the second time it was collapseOne1136.
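
The effect of a changing id can be reproduced offline. The sketch below uses the standard library's `xml.etree.ElementTree`, which supports only a subset of XPath, on hypothetical, heavily simplified markup; the real page structure is an assumption here, and only the two ids and the class names come from this thread. `PAGE_TEMPLATE`, `ID_BASED`, `CLASS_BASED`, and `find_next_link` are illustrative names.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup for one page load. Only the generated id changes
# between loads; the class names stay the same.
PAGE_TEMPLATE = """
<div id="{panel_id}">
  <span class="custom-paginator">
    <ul>
      <li class="previous fw-pagination-btn"><a href="#">Previous</a></li>
      <li class="next fw-pagination-btn"><a href="#">Next</a></li>
    </ul>
  </span>
</div>
"""

# Positional path pinned to the id seen on the first load.
ID_BASED = ".//div[@id='collapseOne5168']/span/ul/li[2]/a"
# Path that depends only on the stable class attribute.
CLASS_BASED = ".//li[@class='next fw-pagination-btn']/a"


def find_next_link(panel_id, xpath):
    """Build the sample page with the given id and evaluate the XPath on it."""
    page = PAGE_TEMPLATE.format(panel_id=panel_id)
    root = ET.fromstring(f"<body>{page}</body>")
    return root.find(xpath)  # None when nothing matches


for panel_id in ("collapseOne5168", "collapseOne1136"):
    by_id = find_next_link(panel_id, ID_BASED)
    by_class = find_next_link(panel_id, CLASS_BASED)
    print(panel_id, "id-based:", by_id is not None,
          "class-based:", by_class is not None)
```

The id-based path matches only the load whose id it was copied from, while the class-based path matches both.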

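You might consider locating the button by its class instead. Note that find_element_by_class_name is meant for a single class name, so a compound value such as "next fw-pagination-btn" will not match as intended; use a CSS selector for it, for example find_element_by_css_selector("li.next.fw-pagination-btn a").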
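
A space-separated class attribute is usually expressed as a CSS selector, since a class-name lookup takes one class at a time. The helper below is a minimal sketch of that conversion; `classes_to_css` is a hypothetical name, and it assumes the class names contain no characters that need CSS escaping.

```python
def classes_to_css(class_attr, tag=""):
    """Turn a space-separated class attribute into a CSS selector."""
    names = class_attr.split()
    if not names:
        raise ValueError("class_attr must contain at least one class name")
    # Each class becomes ".name"; chaining them requires all classes at once.
    return tag + "".join("." + name for name in names)


# Select the link inside the list item that carries both classes.
selector = classes_to_css("next fw-pagination-btn", tag="li") + " a"
print(selector)  # li.next.fw-pagination-btn a
```

The resulting string can be passed to `driver.find_element(By.CSS_SELECTOR, selector)` or used inside an explicit wait.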