"救援" R中的命令?

Time: 2014-12-05 20:58:25

Tags: r web-scraping internal-server-error rvest

I have this code:

library(rvest)
url_list <- c("https://github.com/rails/rails/pull/100", 
               "https://github.com/rails/rails/pull/200", 
               "https://github.com/rails/rails/pull/300")

mine <- function(url){
  url_content  <- html(url)
  url_mainnode <- html_node(url_content, "*")
  url_mainnode_text <- html_text(url_mainnode)
  url_mainnode_text <- gsub("\n", "", url_mainnode_text) # clean up the text
  url_mainnode_text
}

messages <- lapply(url_list, mine)

However, as I make the list longer, I tend to run into:

Error in html.response(r, encoding = encoding) : 
  server error: (500) Internal Server Error 

I know that in Ruby I can use rescue to keep iterating over a list even when some calls to the function fail. Is there something similar in R?

1 answer:

Answer 0 (score: 2)

One option is to use try(). For details, see the R help page (?try). Here is an implementation:

library(rvest)
url_list <- c("https://github.com/rails/rails/pull/100", 
               "https://github.com/rails/rails/pull/200", 
               "https://github.com/rails/rails/pull/300")

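mine <- function(url){
  # try() returns an object of class "try-error" if html() fails
  url_content <- try(html(url), silent = TRUE)
  if (inherits(url_content, "try-error")) return(NA_character_) # skip this URL
  url_mainnode <- html_node(url_content, "*")
  url_mainnode_text <- html_text(url_mainnode)
  url_mainnode_text <- gsub("\n", "", url_mainnode_text) # clean up the text
  url_mainnode_text
}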

messages <- lapply(url_list, mine)
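
A closer analogue to Ruby's rescue is tryCatch(), which lets you supply a handler that runs when an error is raised. The following is only a minimal sketch under stated assumptions: fake_fetch() is a hypothetical stand-in for the network call (html(url) in the code above, read_html(url) in current rvest releases) so that the example runs offline, and rescue() is an illustrative helper name, not part of base R or rvest.

```r
# Hypothetical stand-in for the network fetch: fails for URLs ending in "200".
fake_fetch <- function(url) {
  if (grepl("200$", url)) stop("server error: (500) Internal Server Error")
  paste("content of", url)
}

# Illustrative helper: wrap a function so that an error produces a fallback
# value (NA by default) instead of aborting the surrounding loop.
rescue <- function(f, otherwise = NA_character_) {
  function(...) {
    tryCatch(
      f(...),
      error = function(e) {
        message("Skipping: ", conditionMessage(e))
        otherwise
      }
    )
  }
}

url_list <- c("https://github.com/rails/rails/pull/100",
              "https://github.com/rails/rails/pull/200",
              "https://github.com/rails/rails/pull/300")

# lapply() now finishes even though the second call raises an error.
messages <- lapply(url_list, rescue(fake_fetch))

# Collect the URLs that failed so they can be retried or inspected later.
failed <- url_list[vapply(messages, is.na, logical(1))]
```

Returning NA for failed URLs keeps messages the same length as url_list, so results stay aligned with their inputs and failed can be passed through the same function again later.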