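I'm trying to scrape REI reviews (for a hammock) using RSelenium and rvest. I want to click the button at the bottom of the page x times so that I can scrape all of the reviews. I'm a bit lost. This is what I have so far. If you also know how to preview what the browser is doing in the Viewer (rather than taking screenshots), that would be cool. Thanks, Stack community.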
replicate(100,
{
remDr$navigate("https://www.amazon.com/Eagles-Nest-Outfitters-DoubleNest-Portable/product-reviews/B00K30GXK8/ref=cm_cr_dp_d_show_all_btm?ie=UTF8&reviewerType=all_reviews")
webElem <- remDr$findElement("css", "body")
webElem$sendKeysToElement(list(key = "end"))
morereviews <- remDr$findElement(using = 'css selector', ".a-last a")
morereviews$clickElement
Sys.sleep(4)
reviews <- xml2::read_html(remDr$getPageSource()[[1]])%>%
rvest::html_nodes(".review-text")%>%
dplyr::data_frame(reviews = .)
})
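Answer 0 (score: 0)
Try this. It calls clickElement() with parentheses (without them the method is never invoked), clicks through the reviews first, and reads the page source only once at the end: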
# Click the Load More button
replicate(100,
{
# scroll down
webElem <- remDr$findElement("css", "body")
webElem$sendKeysToElement(list(key = "end"))
# find button
morereviews <- remDr$findElement(using = 'css selector', "#BVRRContainer div.bv-content-pagination-container button")
# click button
morereviews$clickElement()
# wait
Sys.sleep(4)
})
# Scrape the reviews (%>% requires magrittr or dplyr to be attached)
reviews <- xml2::read_html(remDr$getPageSource()[[1]]) %>%
  rvest::html_nodes("#BVRRContainer div.bv-content-summary-body-text") %>%
  rvest::html_text() %>%
  tibble::tibble(reviews = .)
reviews
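
One caveat with the loop above: it always attempts 100 clicks, and findElement is expected to raise an error once the button is no longer on the page, which would abort replicate(). A minimal sketch of a loop that stops at the first failure instead is shown below. The helper name click_until_gone and its arguments are hypothetical (they are not part of RSelenium), and the sketch assumes that a missing button surfaces as an ordinary R error.

```r
# Hypothetical helper: call `click_once()` repeatedly until it raises an error
# or `max_clicks` is reached. `click_once` is any zero-argument function that
# finds and clicks the button and errors when the button is missing.
click_until_gone <- function(click_once, max_clicks = 100, wait = 0) {
  clicks <- 0
  while (clicks < max_clicks) {
    ok <- tryCatch({
      click_once()
      TRUE
    }, error = function(e) FALSE)  # treat any error as "button is gone"
    if (!ok) break
    clicks <- clicks + 1
    Sys.sleep(wait)                # give the page time to load more reviews
  }
  clicks                           # number of successful clicks
}

# Possible usage with the session from the answer (assumes `remDr` is an open
# RSelenium client; left commented out because it needs a live browser):
# n <- click_until_gone(function() {
#   btn <- remDr$findElement(using = "css selector",
#     "#BVRRContainer div.bv-content-pagination-container button")
#   btn$clickElement()
# }, wait = 4)
```

Because every error is treated as the stop signal, a wrong selector would also end the loop silently with zero clicks, so it is worth checking the returned count before scraping.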