Question

I am looking for an equivalent of the loop keyword next for use inside a purrr::map_df call.

map_df handles NULL data frames gracefully (as the example below shows), so it works when I set Result <- NULL in the example below.
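As a quick aside, the NULL-handling behaviour can be checked in isolation. This is a minimal sketch; the helper square_if_odd and its inputs are hypothetical and only serve to illustrate that NULL results are dropped when the rows are bound together.

```r
library(purrr)

# Hypothetical toy function: NULL for even inputs, a one-row data frame otherwise
square_if_odd <- function(i) {
  if (i %% 2 == 0) return(NULL)
  data.frame(i = i, sq = i^2)
}

# map_df row-binds the results; the NULL entries contribute no rows
res <- map_df(1:4, square_if_odd)
res
```

Only the rows for i = 1 and i = 3 remain in res.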

Can anyone suggest a general solution for the example below that does not require me to set Result <- NULL, but instead moves on to the "next" element right away?

library(tidyverse)
set.seed(1000)

df <- data.frame(x = rnorm(100), y = rnorm(100), z = rep(LETTERS, 100))

Map_Func <- function(df) {

  Sum_Num <- suppressWarnings(sqrt(sum(df$y)))

  if (Sum_Num == "NaN") {

    Result <- NULL
    # I would like to have an equivalent to "next" here...

  } else {

    Result <- df %>% filter(y == max(y)) %>% mutate(Result = x * y)

  }

  Result

}

Test <- split(df, df$z) %>% map_df(~Map_Func(.))

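In the code above, what can I use in place of Result <- NULL in the ugly if statement? That is, I would like to simply check the condition and effectively do a "next".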

Answer 1

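To exit a function early, you can use return(<output>). This exits the function immediately with the output you specify. The code below gives the same output as the example code.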

library(tidyverse)
set.seed(1000)

df <- data.frame(x = rnorm(100), y = rnorm(100), z = rep(LETTERS, 100))

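Map_Func <- function(df) {

  Sum_Num <- suppressWarnings(sqrt(sum(df$y)))

  # Early exit: skip this group, much like "next" would in a loop
  if (Sum_Num == "NaN") {
    return(NULL)
  }

  # An assignment as the last expression returns its value invisibly,
  # which is all map_df needs
  Result <- df %>% filter(y == max(y)) %>% mutate(Result = x * y)
}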

Test <- split(df, df$z) %>% map_df(~Map_Func(.))
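A slightly tidier variant of the same guard-clause idea is sketched below. It assumes is.nan() is an acceptable replacement for the comparison Sum_Num == "NaN" (which only works because R coerces NaN to the string "NaN"), and it drops the final assignment so the result is returned visibly. The data frame toy and the name Map_Func2 are hypothetical and chosen so the outcome is easy to verify by hand.

```r
library(dplyr)
library(purrr)

# Hypothetical toy data: group "a" has a positive sum of y, group "b" a negative one
toy <- data.frame(
  x = c(1, 2, 3, 4),
  y = c(1, 2, -1, -2),
  z = c("a", "a", "b", "b")
)

Map_Func2 <- function(df) {
  Sum_Num <- suppressWarnings(sqrt(sum(df$y)))

  # Guard clause: leave early, which plays the role of "next"
  if (is.nan(Sum_Num)) return(NULL)

  # The value of the last expression is the function's return value
  df %>% filter(y == max(y)) %>% mutate(Result = x * y)
}

Test2 <- split(toy, toy$z) %>% map_df(Map_Func2)
Test2
```

Group "b" is skipped entirely, so Test2 holds only the single row from group "a".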

Answer 2

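Logically this is not an entirely different solution from the OP's, but an attempt to keep things clean by using separate functions. The custom_check function checks the condition for each group. With map_if, we apply Map_Func_true only when custom_check returns TRUE; otherwise we apply Map_Func_false, which returns NULL. Finally, we bind the rows.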

library(tidyverse)

Map_Func_true <- function(df) {
  df %>% filter(y == max(y)) %>% mutate(Result = x*y)
}

Map_Func_false <- function(df) { return(NULL) }

custom_check <- function(df) {
    !is.nan(suppressWarnings(sqrt(sum(df$y))))
}


df %>%
  group_split(z) %>%
  map_if(., custom_check, Map_Func_true, .else = Map_Func_false) %>%
  bind_rows()


# A tibble: 26 x 4
#       x     y z     Result
#   <dbl> <dbl> <fct>  <dbl>
# 1  1.24  2.00 A       2.47
# 2  1.24  2.00 A       2.47
# 3  1.24  2.00 C       2.47
# 4  1.24  2.00 C       2.47
# 5  1.24  2.00 E       2.47
# 6  1.24  2.00 E       2.47
# 7  1.24  2.00 G       2.47
# 8  1.24  2.00 G       2.47
# 9  1.24  2.00 I       2.47
#10  1.24  2.00 I       2.47
# … with 16 more rows
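The same predicate can also be used to filter the groups before mapping, so that no NULL-returning function is needed. This is only a sketch: it swaps map_if for purrr::keep(), uses split() rather than group_split(), and runs on a hypothetical toy data frame rather than the data from the question.

```r
library(dplyr)
library(purrr)

# Hypothetical toy data: only group "a" passes the check
toy <- data.frame(
  x = c(1, 2, 3, 4),
  y = c(1, 2, -1, -2),
  z = c("a", "a", "b", "b")
)

custom_check <- function(df) {
  !is.nan(suppressWarnings(sqrt(sum(df$y))))
}

Map_Func_true <- function(df) {
  df %>% filter(y == max(y)) %>% mutate(Result = x * y)
}

kept <- split(toy, toy$z) %>%
  keep(custom_check) %>%      # drop the groups that fail the check
  map_df(Map_Func_true)       # map over the survivors and bind the rows
kept
```

Whether this reads better than map_if is a matter of taste; it separates "which groups" from "what to do with them".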

Answer 3

Here is another way to look at it, using purrr::safely:

Map_Func <- function(df) {

  Sum_Num <- suppressWarnings(sqrt(sum(df$y)))

  df %>% filter(y == max(y)) %>% mutate(Result = x*y)

}

Test <- split(df, df$z) %>% 
  map(safely(~Map_Func(.))) %>% 
  transpose() %>% 
  pluck("result") %>% # use 'error' here to get the error log
  bind_rows()

This way the function becomes more concise, and you also get a nice error log.
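One caveat: safely() only captures errors. As written above, Map_Func computes Sum_Num but never uses it, and sqrt() of a negative number yields NaN with a warning rather than an error, so there would likely be nothing to catch. The sketch below assumes the NaN case should be turned into an explicit error with stop(). The names Map_Func_strict and toy are hypothetical, and possibly() is shown as an alternative for when the error log is not needed.

```r
library(dplyr)
library(purrr)

# Hypothetical toy data: group "b" has a negative sum of y
toy <- data.frame(
  x = c(1, 2, 3, 4),
  y = c(1, 2, -1, -2),
  z = c("a", "a", "b", "b")
)

Map_Func_strict <- function(df) {
  Sum_Num <- suppressWarnings(sqrt(sum(df$y)))
  # Assumption: signal a real error so that safely()/possibly() can catch it
  if (is.nan(Sum_Num)) {
    stop("negative sum of y in group ", as.character(df$z[1]))
  }
  df %>% filter(y == max(y)) %>% mutate(Result = x * y)
}

groups <- split(toy, toy$z)

# safely(): results and errors are kept side by side
out <- groups %>% map(safely(Map_Func_strict)) %>% transpose()
Test_safe <- bind_rows(out$result)   # NULL results are dropped
Error_log <- compact(out$error)      # only the groups that failed

# possibly(): failing groups are replaced by NULL and silently dropped
Test_poss <- groups %>% map_df(possibly(Map_Func_strict, otherwise = NULL))
```

Here Error_log contains one entry, named after the failing group "b".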

Equivalent of next in purrr::map_df

3 answers: