How to resolve the "Don't know how to pluck from a closure" error in R

Asked: 2019-01-13 03:23:40

Tags: r tidyverse purrr

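If I remove Sys.sleep() from the map() call, the code below works. I tried to research the error ("Don't know how to pluck from a closure"), but I found very little on the topic.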

Does anyone know where I can find documentation on this error, and can anyone explain why it occurs and how to prevent it?

library(rvest)
library(tidyverse)
library(stringr)

# page indices 0 to 18; use a smaller range (e.g. 0:2) for a quick run
page <- (0:18)

# no need to create a list. Just a vector
urls = paste0("https://www.mlssoccer.com/players?page=", page)

# define this function that collects the player's name from a url
get_the_names = function( url){
  url %>% 
    read_html() %>% 
    html_nodes("a.name_link") %>% 
    html_text()
}

# map the urls to the function that gets the names
players = map(urls, get_the_names) %>% 
# turn into a single character vector
unlist() %>% 
# make lower case
tolower() %>% 
# replace spaces with hyphens
str_replace_all(" ", "-")


# Now create a vector of player urls
player_urls = paste0("https://www.mlssoccer.com/players/", players )


# define a function that reads the 3rd table of the url
get_the_summary_stats <-  function(url){

  url %>% 
    read_html() %>% 
    html_nodes("table") %>% 
    html_table() %>% .[[3]]

}

# read only the first 5 players to speed things up [otherwise it takes a significant amount of time to run...]
a_few_players <- player_urls[1:5]

# get the stats 
tables = a_few_players %>% 
# important step so I can name the rows I get in the table
set_names() %>% 
# map the player urls to the function that reads the 3rd table
# note the `safely` wrapper around the get_the_summary_stats function,
# since some players have no stats, which causes an error (e.g. brenden-aaronson)
# the output will be a list of lists [result and error]
map(., ~{ Sys.sleep(5) 
  safely(get_the_summary_stats) }) %>%
# collect only the `result` output (the table) into a data frame
# (there is also an `error` output)
# and label each row with the player's name
map_df("result", .id = "player") %>% 
# keep only the player name (remove the https://www.mlssoccer.com/players/ part)
mutate(player = str_replace(player, "https://www.mlssoccer.com/players/", "")) %>%
as_tibble()

tables <- tables %>% separate(Match,c("awayTeam","homeTeam"), extra= "drop", fill = "right")

1 answer:

Answer 0: (score: 0)

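purrr::safely(...) returns a function, so your map(., ~{ Sys.sleep(5); safely(get_the_summary_stats) }) returns a list of functions, not any data. In R, a "closure" is a function together with its enclosing environment. The subsequent map_df("result", ...) then tries to extract "result" from each of those functions, which is what raises the error.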
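The behaviour can be reproduced without any network access. The following is a minimal sketch, assuming only that purrr is installed; fake_stats is a hypothetical stand-in for get_the_summary_stats and is not part of the original code.

```r
library(purrr)

# Hypothetical stand-in for get_the_summary_stats(); no scraping involved
fake_stats <- function(url) data.frame(url = url, goals = 1L)

urls <- c("a", "b")

# The braces evaluate to safely(fake_stats), i.e. a function, for each element;
# the wrapped function is never actually called
wrapped <- map(urls, ~ { safely(fake_stats) })
all_functions <- all(vapply(wrapped, is.function, logical(1)))

# Extracting "result" from a function is the step that fails;
# tryCatch() keeps the sketch runnable and stores the message instead
pluck_message <- tryCatch(
  {
    map(wrapped, "result")
    "no error"
  },
  error = function(e) conditionMessage(e)
)
```

Here all_functions is TRUE, which confirms that map() collected functions rather than tables. The exact wording stored in pluck_message may differ between purrr versions.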

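The tilde notation is a tidyverse-specific, more concise way of writing anonymous functions. Normally (e.g., with lapply) one would write lapply(mydata, function(x) get_the_summary_stats(x)). In tilde notation, the same thing is written as map(mydata, ~ get_the_summary_stats(.))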
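As a small illustration of that equivalence, the sketch below applies a hypothetical helper, square, in both styles. Inside a purrr formula the argument can be written as either . or .x.

```r
library(purrr)

# Hypothetical helper used only for illustration
square <- function(x) x^2
values <- c(1, 2, 3)

with_lapply  <- lapply(values, function(x) square(x))  # base R anonymous function
with_formula <- map(values, ~ square(.))               # purrr tilde notation
with_dot_x   <- map(values, ~ square(.x))              # .x is the same argument as .
```

All three calls return the same list of squared values.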

So, rewritten:

... %>% map(~ { Sys.sleep(5); safely(get_the_summary_stats)(.); })
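The same pattern can be tried offline. This is a sketch under stated assumptions: fake_stats and the player names are hypothetical stand-ins for get_the_summary_stats and the real player URLs, and the pause is shortened from 5 seconds. Building the safe wrapper once outside the lambda is a design choice, not something the fix requires.

```r
library(purrr)

# Hypothetical stand-in for get_the_summary_stats(): fails for one input,
# mimicking a player page that has no stats table
fake_stats <- function(url) {
  if (url == "no-stats") stop("no table found")
  data.frame(goals = nchar(url))
}

player_urls <- c("alice", "no-stats", "bob")

# Create the safe wrapper once, then CALL it inside the lambda
safe_stats <- safely(fake_stats)

results <- map(set_names(player_urls), ~ {
  Sys.sleep(0.1)   # short pause for the sketch; the question uses 5 seconds
  safe_stats(.x)
})

# Each element is now a list with `result` and `error`;
# compact() drops the NULL results of the failed calls
stats <- compact(map(results, "result"))
```

Because the wrapper is now actually called, each element of results is a list with result and error components, so extracting "result" works and the failed player is simply dropped by compact().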

From a comment by @r2evans