Question

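I wrote a small program that scrapes a Google search results page, and I want to collect all the URLs shown on that page. However, the output I get is character(0). Please help.

Code: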

library("rvest")
r_h  = read_html("https://www.google.com/search?q=google&oq=google&aqs=chrome.0.69i59j0l2j69i60l2j69i65.1101j0j7&sourceid=chrome&ie=UTF-8")
d  =  r_h %>% html_nodes(".iUh30") %>% html_text() %>% as.character()

Answer 1

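That class does not exist in the HTML that is returned. You need to use a different selector strategy and then extract the href attribute.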

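library(rvest)
library(stringr)
r_h  = read_html("https://www.google.com/search?q=google&oq=google&aqs=chrome.0.69i59j0l2j69i60l2j69i65.1101j0j7&sourceid=chrome&ie=UTF-8")
# Select the links by a class that is present in the returned HTML, then read each href
d  =  r_h %>% html_nodes(".jfp3ef > a") %>% html_attr("href")

# Each href embeds the target URL; capture from "http" up to the first "&"
for(i in d){
  res <- str_match_all(i,'(http.*?)&')
  print(res[[1]][,2])
}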
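
The extraction step in the loop above can also be tried without a network request. The following is a minimal sketch in base R, assuming the href values look like "/url?q=<target>&sa=..."; the sample hrefs and the helper name extract_url are hypothetical and are not captured from a real response.

```r
# Hypothetical href values (an assumption about their shape, not real output).
hrefs <- c(
  "/url?q=https://www.google.com/&sa=U&ved=abc",
  "/url?q=https://about.google/&sa=U&ved=def",
  "/search?q=google&ie=UTF-8"
)

# Hypothetical helper: capture everything from "http" up to the first "&",
# using the same pattern as the loop above. Returns NA where nothing matches.
extract_url <- function(x) {
  m <- regmatches(x, regexec("(http.*?)&", x, perl = TRUE))
  vapply(
    m,
    function(hit) if (length(hit) >= 2) hit[2] else NA_character_,
    character(1)
  )
}

urls <- extract_url(hrefs)
urls <- urls[!is.na(urls)]  # drop hrefs that do not embed a URL
print(urls)
```

Working on the whole vector at once returns a plain character vector that can be stored or filtered, instead of printing one match per iteration.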
