Question

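I'm building various web scrapers in Python using Selenium. I iterate over a loop, and at the start of each iteration I go back to the main page. On the first iteration everything works seamlessly. On the second iteration, however, the last line throws a StaleElementReferenceException: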

selenium.common.exceptions.StaleElementReferenceException: Message: stale element reference: element is not attached to the page document

Here is my code:

    for link in links:
        #Go to MP3 converter homepage
        self.__driver.get("http://convert2mp3.net/en/")         

        #Type in the video link
        urlinput = self.__driver.find_element_by_id("urlinput") 

        #TEST
        print(str(urlinput))

        self.__action.send_keys_to_element(urlinput, "https://www.youtube.com" + link + Keys.ENTER).perform()

The TEST print in the code is there to prove that it really can get the element from the page. It prints the following:

<selenium.webdriver.remote.webelement.WebElement (session="509fc04e4130a25f46f6068684b97a1a", element="0.9812681457094412-1")>
<selenium.webdriver.remote.webelement.WebElement (session="509fc04e4130a25f46f6068684b97a1a", element="0.36225331932442084-1")>

So, as you can see, it gets through almost two full iterations, but crashes on the last line of the second iteration.

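Earlier, as a test, I also dumped the entire page source to a text file. The element is in fact loaded by the time execution reaches the failing line, and the element I'm fetching really is present in that source. I'm not sure why it stops working after the first iteration.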

Edit: Someone in the comments asked me to show how I define links:

    yt_url = req.urlopen(vid_link)
    #Note here that 'sopa' is BeautifulSoup
    yt_page = sopa(yt_url, "html.parser")

    #Get all links
    temp_links = yt_page.find_all("a", href = True)
    links = []

    #Filter playlist to get just the video links
    for l in temp_links:
        if l["href"] not in links and "index" in l["href"] and  l["href"].startswith("/watch"):
            links.append(l["href"])

    return links

Answer 1

Solution:

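After some further tinkering, I found the answer. In my for loop I was indeed finding the element again, but I didn't realize that Selenium also wants me to create the ActionChains object again. I had initialized it once in the constructor; I then added the following line of code at the start of the for loop: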

        self.__action = chain.ActionChains(self.__driver)

And it works.

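I had assumed that, because I passed in a reference to self.__driver the first time, the ActionChains object would update its information whenever the driver loaded a new web address, but that doesn't seem to be how Selenium does things.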
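A likely explanation, offered here as an assumption rather than something the thread confirms: an ActionChains object queues every action it is given, and in Selenium versions of that era perform() replays the whole queue without clearing it. On the second iteration the chain would therefore first re-send keys to the element found in the first iteration, which the page reload has already detached. The sketch below models that behavior in plain Python. FakeElement, FakeActionChain, StaleElementError and run_loop are hypothetical stand-ins written for illustration, not Selenium classes.

```python
class StaleElementError(Exception):
    """Hypothetical stand-in for Selenium's StaleElementReferenceException."""


class FakeElement:
    """Hypothetical stand-in for a WebElement that can become detached."""

    def __init__(self, name):
        self.name = name
        self.attached = True
        self.typed = []

    def send_keys(self, text):
        if not self.attached:
            raise StaleElementError(self.name + " is not attached to the page")
        self.typed.append(text)


class FakeActionChain:
    """Assumed model of a chain: queues actions, replays ALL of them on perform()."""

    def __init__(self):
        self._actions = []

    def send_keys_to_element(self, element, text):
        # Store the action for later; nothing is sent yet.
        self._actions.append(lambda: element.send_keys(text))
        return self

    def perform(self):
        # The queue is never cleared, so earlier actions run again.
        for action in self._actions:
            action()


def run_loop(links, reuse_chain):
    """Return the number of completed iterations; may raise StaleElementError."""
    shared_chain = FakeActionChain()  # created once, like in a constructor
    previous = None
    completed = 0
    for index, link in enumerate(links):
        # Reloading the page detaches the element found in the last iteration.
        if previous is not None:
            previous.attached = False
        element = FakeElement("urlinput-" + str(index))
        chain = shared_chain if reuse_chain else FakeActionChain()
        chain.send_keys_to_element(element, link).perform()
        previous = element
        completed += 1
    return completed
```

With reuse_chain=True the model fails on the second iteration, matching the traceback in the question. With reuse_chain=False (a fresh chain per iteration, as in the fix above) every iteration completes. Selenium's ActionChains also has a reset_actions() method that clears the stored actions, which may be an alternative to constructing a new object each time.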

Why do I get a StaleElementReferenceException even though it found the element on the page?
