NSE: lazyeval::lazy vs. substitute when referencing variable names

Time: 2015-08-21 14:49:36

Tags: r dplyr lazy-evaluation

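I am still trying to wrap my head around non-standard evaluation and how it is used in dplyr. I am having trouble understanding why lazy evaluation matters when the function arguments are variable names, since in that case the environment of the original context does not seem to matter.

In the code below, the function select3() uses lazy evaluation but fails, I believe because it tries to follow the variable name order all the way up to base::order.

Is it okay to use substitute() as in my select4(), or is there some other way to implement this function? When does preserving the original environment actually matter, given that I really want these arguments to refer to variables?

Thanks!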

library(dplyr)
library(lazyeval)
# Same as dplyr::select
select2 <- function(.data, ...) {
  select_(.data, .dots = lazy_dots(...))
}

# I want to have two capture groups of variables, so I need named arguments.
select3 <- function(.data, group1, group2) {
  out1 <- select_(.data, .dots = lazy(group1))
  out2 <- select_(.data, .dots = lazy(group2))

  list(out1, out2)
}


df <- data.frame(x = 1:2, y = 3:4, order = 5:6)

# select3 seems okay at first...
df %>% select2(x, y)
df %>% select3(x, y)


# But fails when the variable is a function defined in the namespace
df %>% select2(x, order)
df %>% select3(x, order)
# Error in eval(expr, envir, enclos) : object 'datafile' not found


# Using substitute instead of lazy works. But I'm not sure I understand the 
# implications of doing this.
select4 <- function(.data, group1, group2) {
  out1 <- select_(.data, .dots = substitute(group1))
  out2 <- select_(.data, .dots = substitute(group2))

  list(out1, out2)
}

df %>% select4(x, order)

PS: On a related note, is this a bug or expected behavior?

select(df, z)
# Error in eval(expr, envir, enclos) : object 'z' not found

# But if I define z as a numeric variable it works.
z <- 1
select(df, z)

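Update

A. Webb points out below that the environment does matter for select(), because special functions like one_of() may use objects from it.

Update 2

I used to have an ugly hack as a fix, but here is a better approach; I should have known that even lazy() has a standard-evaluation counterpart, lazy_().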

select6 <- function(.data, group1, group2) {
  g1 <- lazy_(substitute(group1), env = parent.frame())
  g2 <- lazy_(substitute(group2), env = parent.frame())

  out1 <- select_(.data, .dots = g1)
  out2 <- select_(.data, .dots = g2)

  list(out1, out2)
}

# Or even more like the original...

lazy_parent <- function(expr) {
  # Need to go up twice, because lazy_parent creates an environment for itself
  e1 <- substitute(expr)
  e2 <- do.call("substitute", list(e1), envir = parent.frame(1))

  lazy_(e2, parent.frame(2))
}

select7 <- function(.data, group1, group2) {
  out1 <- select_(.data, .dots = lazy_parent(group1))
  out2 <- select_(.data, .dots = lazy_parent(group2))

  list(out1, out2)
}

1 answer:

Answer 0 (score: 2)

The issue here is that lazy() follows promises by default, and order is a promise because packages are lazy-loaded.

library(pryr)
is_promise(order)
#> TRUE

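The default for lazy_dots(), which is what select() uses, is the opposite.

But something else is going on here too: the special nature of ... is used to extract the unevaluated expressions. Your use of substitute() will work in many cases, but any attempt to rename columns, as select() allows, will fail.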

select4(df,foo=x,bar=order)
#> Error in select4(df, foo = x, bar = order) : 
#>   unused arguments (foo = x, bar = order)

However, this works:

select5 <- function(.data, ...) {
  dots<-lazy_dots(...)
  out1 <- select_(.data, .dots=dots[1])
  out2 <- select_(.data, .dots=dots[2])
  list(out1, out2)
}


select5(df,foo=x,bar=order)
#> [[1]]  
#>   foo
#> 1   1
#> 2   2
#> 
#> [[2]]
#>   bar
#> 1   5
#> 2   6
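
To see why the named-argument version cannot rename, here is a minimal base R sketch of the argument matching involved. It does not use dplyr or lazyeval. The helpers named_version and dots_version are hypothetical stand-ins for select4 and select5, and the alist() trick is just one common way to look at what ... contains without evaluating it.

```r
# Hypothetical stand-in for select4: two fixed, named formals.
named_version <- function(.data, group1, group2) {
  list(substitute(group1), substitute(group2))
}

# Hypothetical stand-in for select5: everything arrives through `...`.
dots_version <- function(.data, ...) {
  # substitute() splices the contents of `...` into the alist() call, and
  # alist() returns its arguments unevaluated, names included.
  eval(substitute(alist(...)))
}

# foo= and bar= match no formal of named_version, so the call is rejected
# before any expression can be captured.
named_msg <- tryCatch(
  named_version(NULL, foo = x, bar = order),
  error = function(e) conditionMessage(e)
)

# With `...`, the same call keeps both the names and the bare symbols.
dots_out <- dots_version(NULL, foo = x, bar = order)
names(dots_out)
```

The names on dots_out are what a rename needs, and they only survive because ... accepts arbitrarily named arguments.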

As another example, one where substitute() fails more directly because it carries no environment, consider:

vars<-c("x","y") 

select4(df,one_of(vars),order)
#> Error in one_of(vars, ...) : object 'vars' not found

select5(df,one_of(vars),order)
#> [[1]]
#>   x y
#> 1 1 3
#> 2 2 4
#> 
#> [[2]]
#>   order
#> 1     5
#> 2     6

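The select4 version fails because it cannot find vars, whereas select5 succeeds because lazy_dots() carries the environment along. Note that select4(df,one_of(c("x","y")),order) is fine, because it uses literals.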
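
The role of the environment can be sketched in base R as well. This is only an illustration of the idea, not how lazyeval is implemented: capture_expr and capture_with_env are hypothetical helpers, and cols_wanted is a made-up variable standing in for vars.

```r
# Hypothetical helper: keeps the expression only, as substitute() does.
capture_expr <- function(arg) {
  substitute(arg)
}

# Hypothetical helper: keeps the expression together with the caller's
# environment, which is the idea behind a lazy object.
capture_with_env <- function(arg) {
  caller_env <- parent.frame()
  list(expr = substitute(arg), env = caller_env)
}

caller <- function() {
  cols_wanted <- c("x", "y")  # exists only inside this function
  list(
    bare = capture_expr(rev(cols_wanted)),
    lazy = capture_with_env(rev(cols_wanted))
  )
}
captured <- caller()

# Evaluating the bare expression somewhere else cannot see cols_wanted.
elsewhere <- new.env(parent = baseenv())
bare_result <- tryCatch(
  eval(captured$bare, elsewhere),
  error = function(e) "not found"
)

# Evaluating in the stored environment finds it again.
lazy_result <- eval(captured$lazy$expr, captured$lazy$env)
```

When the expression contains only literals or column names there is nothing to look up outside the data, which is why the bare substitute() approach appears to work in those cases.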