I am trying to scrape a web page with the R package rvest, but when I call html_nodes() it returns an empty list. What is the problem? Here is my code (I used SelectorGadget to get the CSS selectors):
# SETUP
library(tidyverse)
library(rvest)

# Getting the link for each house
main.link <- "https://www.sreality.cz/en/search/for-sale/apartments/praha"
main.page <- read_html(main.link)
links <- html_nodes(main.page, css = ".title .ng-binding")
As you can see, I am a beginner in R. Thanks in advance for your help.
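To show what I mean, here is a minimal check using the objects from the code above (the inspection step is just my attempt at diagnosing the problem); it confirms the selector matches nothing in the HTML that read_html() downloads:

# the node set returned by html_nodes() is empty
length(links)
#> [1] 0
# peeking at the start of the downloaded source suggests the listing markup is
# not in the static page and is probably filled in later by JavaScript
substr(as.character(main.page), 1, 1000)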
Answer 0 (score: 0)
Work with RSelenium, as Selcuk Akbas suggested. The sreality.cz listings are rendered client-side by JavaScript (the .ng-binding class comes from Angular), so read_html() only downloads the empty page template; RSelenium drives a real browser, which lets you scrape the fully rendered HTML instead.
# Loading the packages
library(rvest)
library(magrittr)   # for the %>% pipe operator
library(RSelenium)  # to get the fully rendered HTML of the page

# starting local RSelenium (this is the only way of starting RSelenium that works for me at the moment)
selCommand <- wdman::selenium(jvmargs = c("-Dwebdriver.chrome.verboseLogging=true"), retcommand = TRUE)
shell(selCommand, wait = FALSE, minimized = TRUE)
remDr <- remoteDriver(port = 4567L, browserName = "chrome")
remDr$open()

# Specifying the url of the website to be scraped
main_link <- "https://www.sreality.cz/en/search/for-sale/apartments/praha"

# go to the website in the browser
remDr$navigate(main_link)

# get the rendered page source and parse it as an html object with rvest
main_page <- remDr$getPageSource(header = TRUE)[[1]] %>% read_html()

# get the link nodes
links <- html_nodes(main_page, css = ".title .ng-binding")
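From here, a possible next step (just a sketch, assuming the matched nodes are the <a> elements carrying the href attribute; adjust if the selector returns inner spans instead) is to pull out the titles and URLs and to close the browser when you are done:

# extract the visible titles and the hrefs of the matched links
titles <- html_text(links, trim = TRUE)
hrefs  <- html_attr(links, "href")
# prepend the domain if the hrefs turn out to be relative (check one first)
house_links <- paste0("https://www.sreality.cz", hrefs)
# close the browser session when finished
remDr$close()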
Hope this helps! Happy scraping!