Question

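I'm using Selenium to scrape this website. First, I click the clear button next to the attraction type. Then I click the "more" link at the bottom of the category list. After that, for each category, I find the element by ID and click its link. The problem is that when I click the first category, Outdoor Activities, the site goes back to its initial state, and when I try to click the next link I get the following error: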

StaleElementReferenceException: Message: Element is no longer attached to the DOM

My code is:

import time

from scrapy.spiders import CrawlSpider
from selenium import webdriver


class TripSpider(CrawlSpider):
  name = "tspider"
  allowed_domains = ["tripadvisor.ca"]
  start_urls = ['http://www.tripadvisor.ca/Attractions-g147288-Activities-c42-Dominican_Republic.html']

  def __init__(self):
    self.driver = webdriver.Firefox()
    self.driver.maximize_window()


  def parse(self, response):
    self.driver.get(response.url)
    self.driver.find_element_by_class_name('filter_clear').click()
    time.sleep(3)
    self.driver.find_element_by_class_name('show').click()
    time.sleep(3)
    # Handle the popup: switch to the most recently opened window
    self.driver.switch_to.window(self.driver.window_handles[-1])
    # Close the new window
    self.driver.close()
    # Switch back to original browser (first window)
    self.driver.switch_to.window(self.driver.window_handles[0])
    divs = self.driver.find_elements_by_xpath('//div[contains(@id,"ATTR_CATEGORY")]')
    for d in divs:
      d.find_element_by_tag_name('a').click()
      time.sleep(3)

Answer 1

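The problem with this site in particular is that the DOM changes every time you click an element, so you can't keep iterating over element references that have since gone stale.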
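
To make that concrete, one possible workaround is to repeat the lookup before every click and pick the next link by position, instead of holding on to the list from a single lookup. This is only a sketch under assumptions: `click_each_by_index`, `find_links`, and `after_click` are hypothetical names, not Selenium APIs, and `find_links` is assumed to redo whatever setup the page needs (for example clicking the clear and "more" links again) before returning fresh elements.

```python
def click_each_by_index(find_links, after_click=None):
    """Click every link once, looking the links up again before each click.

    find_links  -- hypothetical callable returning the current list of link
                   elements (fresh references, never cached ones)
    after_click -- optional hypothetical callable run after each click,
                   e.g. to scrape the page that was just loaded
    Returns the number of links clicked.
    """
    index = 0
    while True:
        links = find_links()      # fresh lookup: older references may be stale
        if index >= len(links):
            break                 # every position has been visited
        links[index].click()
        if after_click is not None:
            after_click(index)
        index += 1
    return index


# Assumed usage with the spider's driver and the XPath from the question:
# click_each_by_index(
#     lambda: [d.find_element_by_tag_name('a')
#              for d in driver.find_elements_by_xpath(
#                  '//div[contains(@id,"ATTR_CATEGORY")]')])
```

Passing the lookup in as a callable keeps the loop independent of how the elements are found, so the same loop works whether or not the page has to be reset between clicks.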

I ran into the same problem a long time ago and solved it by opening each link in a separate window.

You can change this part of the code:

divs = self.driver.find_elements_by_xpath('//div[contains(@id,"ATTR_CATEGORY")]')
for d in divs:
    d.find_element_by_tag_name('a').click()
    time.sleep(3)

to:

from selenium.webdriver.common.keys import Keys
mainWindow = self.driver.current_window_handle
divs = self.driver.find_elements_by_xpath('//div[contains(@id,"ATTR_CATEGORY")]')
for d in divs:
    # Open the element in a new Window
    d.find_element_by_tag_name('a').send_keys(Keys.SHIFT + Keys.ENTER)
    # Switch to the window that was just opened (the last handle)
    self.driver.switch_to.window(self.driver.window_handles[-1])

    # Here you do whatever you want in the new window

    # Close the window and continue
    self.driver.find_element_by_tag_name('body').send_keys(Keys.CONTROL + 'w')
    self.driver.switch_to.window(mainWindow)
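
A related variation, not part of the answer above: if the category links are ordinary anchors with an `href`, the URLs can be read once, before any click, and then visited one by one. Plain strings do not go stale the way element references do. The helper names `collect_hrefs` and `visit_all` are hypothetical, and whether loading the URL directly reproduces what a click does on this site is an assumption that would need checking.

```python
def collect_hrefs(link_elements):
    """Read the href of each element now, while the references are still valid."""
    hrefs = []
    for element in link_elements:
        href = element.get_attribute('href')
        if href and href not in hrefs:   # skip missing and duplicate URLs
            hrefs.append(href)
    return hrefs


def visit_all(driver, hrefs, handle_page):
    """Load each URL in turn and collect whatever handle_page returns for it."""
    results = []
    for href in hrefs:
        driver.get(href)
        results.append(handle_page(driver))
    return results


# Assumed usage inside the spider, reusing `divs` from the code above:
# links = [d.find_element_by_tag_name('a') for d in divs]
# titles = visit_all(self.driver, collect_hrefs(links), lambda drv: drv.title)
```

This avoids window handling altogether, at the cost of losing any state that only a real click would set up.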
