Question

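I'm currently trying to scrape a website with Selenium, but I keep running into this error:

StaleElementReferenceException: Message: stale element reference: element is not attached to the page document

The code is supposed to repeatedly click the next button (">") on the results at http://www.grownjkids.gov/ParentsFamilies/ProviderSearch and scrape the results from each page in a loop. It does this correctly for several pages, but it occasionally fails on a random page with the exception above.

I've looked at many StackOverflow posts about similar problems and tried several of the suggested fixes: implementing an explicit wait with the WebDriverWait class, looping with a try/except block and re-locating the element with a driver.find_element... method whenever a StaleElementReferenceException occurs, and trying both

driver.find_element_by_id

and

driver.find_element_by_xpath.

Here is my code: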

url = "http://www.grownjkids.gov/ParentsFamilies/ProviderSearch"
driver = webdriver.Chrome('MY WEBDRIVER FILE PATH')
driver.implicitly_wait(10)

driver.get(url)

#clears text box 
driver.find_element_by_class_name("form-control").clear()

#clicks on search button without putting in any parameters, getting all the results
search_button = driver.find_element_by_id("searchButton")
search_button.click()

#function to find next button 
def find(driver):
    try:
        element = driver.find_element_by_class_name("next")
        if element: 
            return element
    except StaleElementReferenceException:
            while (attempts < 100):
                element = driver.find_element_by_class_name("next")
                if element: 
                    return element
                attempts += 1

#keeps on clicking next button to fetch each group of 5 results 
while True: 
    try: 
        nextButton = WebDriverWait(driver, 2000).until(find)
    except NoSuchElementException:
        break
    nextButton.send_keys('\n') 
    table = driver.find_element_by_id("results")
    html_source = table.get_attribute('innerHTML')
    print html_source

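I have a hunch that increasing the WebDriverWait to 2000 and looping through 100 attempts isn't actually doing anything (maybe it never enters that block?), because the result is the same no matter how much I increase them. Since this is my first time using Selenium, any feedback on my code would also be appreciated; I'm fairly new to Python as well.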

Answer 1

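A StaleElementReferenceException occurs when the web driver tries to perform an action on an element that no longer exists or is no longer valid.

I've added a fluent wait to your code that waits for the element to become clickable. Please try the following code: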

from selenium import webdriver
from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import StaleElementReferenceException, WebDriverException, NoSuchElementException, TimeoutException
from selenium.webdriver.common.by import By

# raw string so the backslashes in the Windows path are not treated as escapes
driver = webdriver.Chrome(r'C:\NotBackedUp\chromedriver.exe')
url = "http://www.grownjkids.gov/ParentsFamilies/ProviderSearch"
driver.get(url)

# clears text box
driver.find_element_by_class_name("form-control").clear()

# clicks on search button without putting in any parameters, getting all the results
search_button = driver.find_element_by_id("searchButton")
search_button.click()

# fluent wait: polls once per second and ignores stale-element errors while polling
wait = WebDriverWait(driver, timeout=1000, poll_frequency=1, ignored_exceptions=[StaleElementReferenceException, WebDriverException])

# keeps on clicking next button to fetch each group of 5 results
i = 1
while True:
    try:
        element = wait.until(EC.element_to_be_clickable((By.CLASS_NAME, 'next')))
        element.click()
        print("Clicked ===> ", i)
        i += 1
    except (NoSuchElementException, TimeoutException):
        # wait.until raises TimeoutException if no clickable next button shows up in time
        break

    table = driver.find_element_by_id("results")
    html_source = table.get_attribute('innerHTML')
    print(html_source)

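The fluent wait keeps trying to find a clickable next button, ignoring any StaleElementReferenceException or WebDriverException raised while it polls. The loop breaks when a NoSuchElementException is raised. Note that when the wait itself times out, wait.until raises TimeoutException rather than NoSuchElementException, so that exception needs to be caught as well for the loop to end cleanly.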
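
To make the "poll and ignore" behaviour concrete, here is a rough, library-free sketch of what a fluent wait does. It is not Selenium's actual implementation; the name poll_until and its built-in TimeoutError are assumptions made for illustration, standing in for WebDriverWait.until and Selenium's TimeoutException.

```python
import time


def poll_until(condition, timeout=10.0, poll_frequency=0.5, ignored_exceptions=()):
    """Call condition() repeatedly until it returns a truthy value.

    Exceptions listed in ignored_exceptions are swallowed and treated as
    "not ready yet"; any other exception propagates immediately.
    Raises TimeoutError if the deadline passes first.
    """
    ignored = tuple(ignored_exceptions)
    deadline = time.monotonic() + timeout
    while True:
        try:
            result = condition()
            if result:
                return result
        except ignored:
            pass  # e.g. the element went stale; try again on the next poll
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within %.1f s" % timeout)
        time.sleep(poll_frequency)
```

The point of the sketch is that ignored exceptions never reach the caller. What the caller sees in the end is either the result or a timeout, which is why the loop's exit condition has to be based on the timeout.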

Hope this helps.

Answer 2

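A StaleElementReferenceException usually occurs when you try to interact with an element, not when you first find it.

Wrap your interaction with the element in a try/except block that catches StaleElementReferenceException.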
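
A minimal sketch of that idea, assuming the lookup and the click are repeated together on each attempt so that every retry works with a fresh reference. The helper names retry_on_stale and click_next, and the attempts and delay defaults, are hypothetical and not part of Selenium; only find_element, By.CLASS_NAME and StaleElementReferenceException come from the library.

```python
import time


def retry_on_stale(action, stale_exceptions, attempts=3, delay=0.5):
    """Run action(); if it raises one of stale_exceptions, wait and retry.

    `action` must re-locate the element itself on every call, so each retry
    uses a fresh reference rather than the stale one.
    """
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except stale_exceptions:
            if attempt == attempts:
                raise  # out of attempts: re-raise the last stale-element error
            time.sleep(delay)


def click_next(driver):
    """Hypothetical usage with Selenium: find and click in a single step."""
    # Imported here so the helper above stays usable without Selenium installed.
    from selenium.common.exceptions import StaleElementReferenceException
    from selenium.webdriver.common.by import By

    return retry_on_stale(
        lambda: driver.find_element(By.CLASS_NAME, "next").click(),
        (StaleElementReferenceException,),
    )
```

In the question's loop, calling click_next(driver) would take the place of nextButton.send_keys('\n'). The same wrapper could also be used around the code that reads the results table, since that element can go stale too when the page redraws.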

Unable to fix StaleElementReferenceException (element is not attached to the page document)