
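I am currently trying to extract the URLs from a Wikipedia page that lists chief executive officers. The code should then open each URL and copy its text into a .txt file for me to use. The problem is that the allurls object contains only the second half of each URL. For example, allurls[1] gives "/wiki/Pierre_Nanterme". So when I run this code: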

library("xml2")
library("rvest")

url <- "https://en.wikipedia.org/wiki/List_of_chief_executive_officers"

allurls <- url %>%
  read_html() %>%
  html_nodes("td:nth-child(2) a") %>%
  html_attr("href") %>%
  .[!duplicated(.)] %>%
  lapply(function(x) read_html(x) %>% html_nodes("body")) %>%
  Map(function(x, y) write_html(x, tempfile(y, fileext = ".txt"), options = "format"),
      ., c(paste("tmp", 1:length(.))))

allurls[1]

I get the following error:

Error: '/wiki/Pierre_Nanterme' does not exist.
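A likely explanation, offered as a guess rather than a confirmed diagnosis: the href attributes on the list page are relative paths, and read_html() appears to treat a string with no scheme as a local file path, which would explain the "does not exist" message. The minimal sketch below assumes that is the cause and resolves the relative paths against the page URL with xml2::url_absolute() before anything is downloaded. The names base_url, relative_hrefs and full_urls are illustrative and not part of the original code, and the input vector is a hand-written stand-in for what html_attr("href") returns.

```r
library(xml2)

# Page the links were scraped from (same URL as in the question).
base_url <- "https://en.wikipedia.org/wiki/List_of_chief_executive_officers"

# Stand-in for the output of html_attr("href"); the duplicate is deliberate
# so that the de-duplication step has something to remove.
relative_hrefs <- c("/wiki/Pierre_Nanterme", "/wiki/Pierre_Nanterme")

# Drop duplicates, then turn each relative path into an absolute URL.
full_urls <- url_absolute(unique(relative_hrefs), base_url)

print(full_urls)
```

In the original pipeline, the equivalent change would be to add a url_absolute(url) step right after html_attr("href"), so that the later read_html(x) call receives complete URLs.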

Scraping URLs from Wikipedia in R yields only half of each URL

0 answers: