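I'm scraping URLs from a directory website, and I keep getting a StaleElementReferenceException whenever the Next button stops working. I'm trying to figure out how to keep the loop from crashing. I want to exit the loop once I have collected the remaining elements and there is no Next button left.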
with open(provider + '-' + state + '-' + city + '-links.csv', 'w', newline='') as file:
    writer = csv.writer(file)
    idx = 1
    while next_page is not None:
        for company in company_links_elements:
            company_url = company.get_attribute("href")
            writer.writerow((idx, company_url, provider))
            idx += 1
        time.sleep(random.randint(2, 3))
        next_page.click()
        # Get next page elements
        company_links_elements = driver.find_elements(By.XPATH,
                                                      "//h3[@class='jss320 jss324 jss337 sc-gzOgki eucExu']/a")
        company_address_elements = driver.find_elements(By.XPATH,
                                                        "//p/strong[@class='dtm-search-listing-address']")
        # Try getting the next page element
        try:
            next_page = driver.find_element(By.XPATH, "//a[@role='link'][contains(text(),'Next')]")
        except NoSuchElementException:
            break
driver.quit()
This is the error:
raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.StaleElementReferenceException: Message: stale element reference: element is not attached to the page document
(Session info: chrome=66.0.3359.181)
(Driver info: chromedriver=2.38.552522 (437e6fbedfa8762dec75e2c5b3ddb86763dc9dcb),platform=Windows NT 10.0.16299 x86_64)
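I tried wrapping the Next button lookup in a try/except that breaks out of the loop if the button doesn't exist each time a new page loads, but that doesn't seem to work.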
Answer 0 (score: 0)
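For debugging, try adding a long sleep/wait after next_page.click() so that the page has finished refreshing after the Next button is clicked before you look up any elements again.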