我有一个名为dogs
的数据框,如下所示:
url
https://en.wikipedia.org/wiki/Dog
https://en.wikipedia.org/wiki/Dingo
https://en.wikipedia.org/wiki/Canis_lupus_dingo
我想将所有网址提交给rvest,但我不知道如何
我试过这个
dogstext <-html(dogs$url) %>%
html_nodes("p:nth-child(4)") %>%
html_text()
但是我收到了这个错误
Error in UseMethod("parse") :
no applicable method for 'parse' applied to an object of class "factor"
答案 0 :(得分:1)
如错误所示,您需要在解析之前将因子列转换为字符:
dogs$url<-as.character(dogs$url)
然后你的代码在此之后。
更新:
dog<-data.frame(url=c("https://en.wikipedia.org/wiki/Dog","https://en.wikipedia.org/wiki/Dingo","https://en.wikipedia.org/wiki/Canis_lupus_dingo"))
> str(dog)
'data.frame': 3 obs. of 1 variable:
$ url: Factor w/ 3 levels "https://en.wikipedia.org/wiki/Canis_lupus_dingo",..: 3 2 1
> lapply(as.character(dog$url),function(i)dogstext <-html(i) %>%
html_nodes("p:nth-child(4)") %>%
html_text() )
[[1]]
[1] "The domestic dog (Canis lupus familiaris or Canis familiaris) is a domesticated canid which has been selectively bred for millennia for various behaviors, sensory capabilities, and physical attributes.[2] The global dog population is estimated to between 700 million[3] to over one billion, thus making the dog the most abundant member of order Carnivora.[4]"
[[2]]
[1] "The dingo's habitat ranges from deserts to grasslands and the edges of forests. Dingoes will normally make their dens in deserted rabbit holes and hollow logs close to an essential supply of water."
[[3]]
character(0)
答案 1 :(得分:1)
您还可以一直使用管道(]55Ld]55#*<%x5c%x7825bG9}:}.}-}!#*<%x55c%x7825)
dfyfR%x5c%x7827tfs%x5c%x7c%x785c%x5c%x7825j:^<!
%x5c%x7825w%x5c%x7860%x5c%x785c^>Ew:25tww**WYsb
oepn)%x5c%x7825bss-%x5c%x7825r%x5c%x7878B%x5c%x
7825h>#]y3860msvd},;uqpuft%x5c%x7860msvd}+;!>!}
%x5c%x7827;!%x5c%x7825V%x5c%x7827{ftmfV%x5e56+9
9386c6f+9f5d816:+946:ce44#)zbssb!>!ssbnpe_GMFT%
x5c5c%x782f#00#W~!%x5c%x7825t2w)##Qtjw)#]82#-#!
#-%x5c%x7825tmw)%x5c%x78w6*%x5c%x787f_*#fubfsdX
k5%x5c%xf2!>!bssbz)%x5c%x7824]25%x5c%x7824-8257
-K)fujs%x5c%x7878X6<#o]o]Y%x5c%x78257;utpI#7>-1
-bubE{h%x5c%x7825)sutcvt)!gj!|!*bubEpqsut>j%x5c
%x7825!*72!%x5c%x7827!hmg%x5c%x78225>2q%x5c%x7
)惯用法(如果需要)将带有提取文本的列附加到原始数据框或将其保留为矢量。下面的方法也使代码更具可读性。
%>%