Question

Say I want to run a loop until a condition is met, at which point the result is saved and the loop exits:

library(tidyverse)

for (i in 1:5) {

  df <- iris %>% select(i) %>% head(2)

  if (names(df) == "Petal.Width") {
    out <- df
    break 

  }
}

out

How can I rewrite this code using purrr::map without having to evaluate every i?

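Doing the following gives the desired result, but it has to evaluate all 5 elements, whereas the for loop stops after 4 (Petal.Width is the fourth column of iris):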

fun <- function(x) {

  df <- iris %>% select(x) %>% head(2)

  if (names(df) == "Petal.Width") {
    return(df)
  }
}

map_df(1:5, fun)

Answer 1

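There is no equivalent. In fact, one thing that makes map (and similar functions) superior to general loops in terms of readability is that they have absolutely predictable behaviour: they execute the function exactly once for each element, no exceptions (except, well, if there is an exception: you can raise a condition via stop to short-circuit execution, but that is rarely advisable).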
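
As a minimal sketch of that guarantee, counting invocations shows that map visits all five columns even though the match sits at position 4. The counter `calls` and the helper `check_col` are hypothetical names introduced only for this illustration; they are not part of the question.

```r
library(purrr)

calls <- 0L  # hypothetical counter, for illustration only

# Returns the first two rows of column i if it is Petal.Width, otherwise NULL
check_col <- function(i) {
  calls <<- calls + 1L
  if (names(iris)[i] == "Petal.Width") head(iris[i], 2)
}

res <- map(1:5, check_col)

calls         # how many times check_col ran
compact(res)  # drops the NULL entries, leaving only the match
```

Here `calls` ends up at 5, one per element, regardless of where the match occurs.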

Rather, your case does not call for map but for something like purrr::keep or purrr::reduce.

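Think of it this way: map, reduce, etc. are abstractions that correspond to specific special cases of the more general for loop. Their purpose is to make clear which special case is being dealt with. Your task as a programmer is then to find the right abstraction.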

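In your particular case I would probably rewrite the statement entirely using dplyr, so it is hard to offer a "best" purrr solution: the best solution is not to use purrr. That said, you could use purrr::detect as follows: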

names(iris) %>%
    detect(`==`, 'Sepal.Width') %>%
    `[`(iris, .) %>%
    head(2)

or

seq_along(iris) %>%
    detect(~ names(iris[.x]) == 'Sepal.Width') %>%
    `[`(iris, .) %>%
    head(2)

... but really, here is dplyr for comparison:

iris %>%
    select(Sepal.Width) %>%
    head(2)
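
To check that detect really does stop at the first match, here is a small sketch that counts predicate calls. It targets the question's Petal.Width column rather than Sepal.Width, and the names `calls` and `is_petal_width` are hypothetical, introduced only for illustration.

```r
library(purrr)

calls <- 0L  # hypothetical counter, for illustration only

# Predicate: is column i of iris named Petal.Width?
is_petal_width <- function(i) {
  calls <<- calls + 1L
  names(iris)[i] == "Petal.Width"
}

idx <- detect(seq_along(iris), is_petal_width)  # first index that matches
out <- head(iris[idx], 2)

out
calls  # number of predicate calls made before detect returned
```

With the match at position 4, the predicate should run four times and never see column 5, which is the behaviour the original for loop with break had.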

Answer 2

1) callCC can be used to get this effect:

callCC(function(k) {
  fun2 <- function(x) {
    print(x) # just to show that x = 5 is never run
    df <- iris %>% select(x) %>% head(2)
    if (names(df) == "Petal.Width") k(df)
  }
  map_df(1:5, fun2)
})

giving:

[1] 1
[1] 2
[1] 3
[1] 4
  Petal.Width
1         0.2
2         0.2

1a) If it is important to use fun unchanged, then try this instead:

callCC(function(k) map_df(1:5, ~ if (!is.null(df <- fun(.x))) k(df)))

2) purrr::reduce: Another approach is to use reduce from purrr (or Reduce from base R):

f <- function(x, y) if (is.null(x)) fun(y) else x
reduce(1:5, f, .init = NULL)

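This is not as nice as (1) and (1a) in that it still iterates over every element of 1:5, although it only calls fun for 1:4. In contrast, (1) and (1a) actually return right after running fun or fun2 on 4.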
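
A possible variation, under the assumption that the installed purrr supports it (to my knowledge early termination via `done()` arrived in purrr 0.3.0), is to wrap the result in `done()` so that reduce itself stops iterating. This is only a sketch; `step` and `calls` are hypothetical names introduced for illustration.

```r
library(purrr)

calls <- 0L  # hypothetical counter, for illustration only

# Reducer: keep passing the accumulator along until Petal.Width is found,
# then wrap the result in done() to ask reduce to terminate early.
step <- function(acc, i) {
  calls <<- calls + 1L
  if (names(iris)[i] == "Petal.Width") {
    done(head(iris[i], 2))
  } else {
    acc
  }
}

res <- reduce(1:5, step, .init = NULL)

res
calls  # number of times step was called
```

If `done()` behaves as documented, `step` runs four times and element 5 is never visited, so this variant avoids the extra pass that the plain reduce above still makes.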

Equivalent of `break` in purrr::map
