My problem:
I am trying to scrape a web page:
library(wdman)         # selenium()
library(seleniumPipes) # remoteDr(), go(), findElement(), elementClick(), %>%
library(rvest)         # html_node()
# start selenium
selServ <- selenium()
# check ports
selServ$log()$stderr
# connect using port
sel <- remoteDr(browserName = "chrome", port = 4567)
# go to the URL
sel %>%
  go("https://jurispub.admin.ch/publiws/pub/search.jsf")
# hit search button
sel %>%
  findElement("name", "form:searchSubmitButton") %>% # find the submit button
  elementClick() # click it
Next, I use a loop to collect some data, and at the end of each iteration I have Selenium click the next button:
i <- 1
while (i <= nr + 1) { # nr is defined elsewhere (not shown)
  Sys.sleep(0.5)
  sel %>%
    getPageSource() %>% # like read_html()
    html_node("table.iceDatTbl") -> dtbl # this is the data table
  sel %>%
    findElement("xpath", ".//img[contains(@src, 'arrow-next')]/../../a") %>%
    elementClick() # go to next page
  i <- i + 1
}
However, after a while (the number of iterations is random), I get this error message:
Error detected:
Response status code : 200
Selenium Status code: 10
Selenium Status summary: StaleElementReference
Selenium Status detail: An element command failed because the referenced element is no longer attached to the DOM.
Selenium message: stale element reference: element is not attached to the page document
(Session info: chrome=69.0.3497.100)
(Driver info: chromedriver=70.0.3538.16 (16ed95b41bb05e565b11fb66ac33c660b721f778),platform=Windows NT 10.0.17134 x86_64)
Please check the response with errorResponse()
Please check the content returned with errorContent()
> errorContent()
[1] "stale element reference: element is not attached to the page document\n
My approach
I think clicking the next button causes the error, since it is the only click inside the loop:
sel %>%
findElement("xpath", ".//img[contains(@src, 'arrow-next')]/../../a") %>%
elementClick() # go to next page
I found this post, and I think I have a similar problem: RSelenium throwing StaleElementReference error
However, I don't know how to implement it in my code.
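Update
This is slightly different from "StaleElementReference Exception in PageFactory", because I already call findElement() right before elementClick(). My problem is more similar to "RSelenium cannot access DOM", and I tried to implement that solution, i.e. a JavaScript click: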
# go to next page
sel %>% findElement("xpath", ".//img[contains(@src, 'arrow-next')]") -> resu
executeScript(sel,"arguments[0].click();", args=list(resu))
However, I still get the "random" StaleElementReference error (less often, though) :,( Is there anything else in my code that could cause this error?
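
For reference, the retry idea from the linked posts could perhaps be wired in roughly as follows. This is only a minimal base-R sketch: with_retry and its parameters max_tries and wait are hypothetical names of my own, not part of seleniumPipes, and it assumes the stale-reference failure surfaces as an ordinary R error that tryCatch() can catch. The flaky_click function is just a stand-in so the sketch runs without a browser.

```r
# Hypothetical helper: call action() and retry if it raises an error.
with_retry <- function(action, max_tries = 5, wait = 0.5) {
  for (attempt in seq_len(max_tries)) {
    result <- tryCatch(
      list(ok = TRUE, value = action()),
      error = function(e) list(ok = FALSE, error = e)
    )
    if (result$ok) return(result$value)
    Sys.sleep(wait) # give the page time to finish re-rendering
  }
  stop("action failed after ", max_tries, " tries: ",
       conditionMessage(result$error))
}

# Stand-in for a flaky click: fails twice, then succeeds.
calls <- 0
flaky_click <- function() {
  calls <<- calls + 1
  if (calls < 3) stop("stale element reference")
  "clicked"
}
outcome <- with_retry(flaky_click, wait = 0)

# In the loop, the find and the click would both go inside the retried
# function, so the element is looked up again on every attempt (untested):
# with_retry(function() {
#   sel %>%
#     findElement("xpath", ".//img[contains(@src, 'arrow-next')]/../../a") %>%
#     elementClick()
# })
```

Retrying the lookup together with the click matters here, because a reference obtained before the table re-renders would stay stale no matter how often the click alone is repeated.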