In R, how do I get all the variables that need to be defined before evaluating an expression?

Asked: 2015-09-02 21:22:46

Tags: r

Suppose I have the expression:

z <- x+y

x and y need to be defined, whereas z does not. How can I get x and y? For example:

missing_values('z<-x+y');
[1] "x" "y"
z needs to be excluded, so that I can tell the user which values must be defined before the expression is evaluated.

2 answers:

Answer 0 (score: 3)

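Here is a possible solution. It assumes that multi-argument functions evaluate their arguments from left to right. That holds for the typical binary operators, which is what you seem to be interested in, but as I noted in a comment on @BenBolker's answer, it is not universally true.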

find_unbound <- function(lang) {
  stopifnot(is.language(lang), !is.expression(lang))

  bound <- character()
  unbound <- character()

  rec_fun <- function(lang) {
    if(is.call(lang)) {
      # These are assignment calls; if any symbols are assigned and have
      # not already been found in a leaf, they are defined as bound

      if((lang[[1]] == as.name("<-") || lang[[1]] == as.name("="))) {
        for(i in as.list(lang)[-(1:2)]) Recall(i)
        if(is.name(lang[[2]]) && !as.character(lang[[2]]) %in% unbound)
          bound <<- c(bound, as.character(lang[[2]]))
      } else for(i in as.list(lang)[-1]) Recall(i)                
    } else if (is.name(lang) && ! as.character(lang) %in% bound)
      # this is a leaf; any symbol that has not previously been bound is
      # by definition unbound

      unbound <<- c(unbound, as.character(lang))
  }
  rec_fun(lang)
  unique(unbound)
}

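find_unbound recurses all the way down to the leaves of the expression to determine whether each symbol has already been bound. Here are some tests to illustrate: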

find_unbound(quote(z <- x + y))
# [1] "x" "y"
find_unbound(quote(z <- x + (y <- 3)))
# [1] "x"
find_unbound(quote(z <- z + 1))           # we even figure out `z` depends on itself, so must be provided
# [1] "z"
find_unbound(quote(z <- x + (x <- 3)))    # note `x` is evaluated before `x <- 3`
# [1] "x"
find_unbound(quote(z <- (x <- 3) + x))    # but here `x <- 3` is evaluated before `x`
# character(0)
find_unbound(quote({x <- 3; z <- x + y})) # works with multiple calls
# [1] "y"
find_unbound(quote({z <- x + y; x <- 3})) # order matters, just like in R evaluation
# [1] "x" "y"
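The left-to-right assumption stated at the top of this answer can fail because R evaluates function arguments lazily, in whatever order the function body forces them. The sketch below uses a hypothetical helper, `second_first` (invented here purely for illustration), that forces its second argument before its first. A left-to-right static analysis would report `x` as unbound, yet the call evaluates without error:

```r
# Hypothetical helper: forces its second argument before its first.
second_first <- function(a, b) {
  force(b)  # evaluates `x <- 3` in the caller's environment first
  a         # only now is `x` looked up
}

# Evaluate in a fresh environment so a pre-existing `x` cannot interfere.
env <- new.env()
result <- eval(quote(second_first(x, x <- 3)), envir = env)
result
```

So the answers from find_unbound are only as reliable as the left-to-right assumption is for the functions that appear in the expression.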

Answer 1 (score: 2)

It sounds like you are asking how to ignore the left-hand side of the expression.

txt <- "z <- x+y"
p <- parse(text=txt)

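R returns expression(z <- x+y) here (parse() yields an expression vector with one element per top-level expression), so we first need to strip off the "expression" wrapper: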

p2 <- p[[1]]

Then we can get the variables from the right-hand side:

all.vars(p2[[3]])
## [1] "x" "y"    
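The steps above could be wrapped into a function with the interface the question asks for. This is only a minimal sketch: the name `missing_values` is taken from the question rather than from any package, and it assumes the text holds a single top-level expression, falling back to all variables when that expression is not an assignment.

```r
# Hypothetical wrapper around parse() + all.vars(); not part of base R.
missing_values <- function(txt) {
  p <- parse(text = txt)
  stopifnot(length(p) == 1)  # assumption: exactly one top-level expression
  e <- p[[1]]
  is_assign <- is.call(e) &&
    (identical(e[[1]], as.name("<-")) || identical(e[[1]], as.name("=")))
  # For an assignment, only the right-hand side (third element) is inspected.
  if (is_assign) all.vars(e[[3]]) else all.vars(e)
}

missing_values("z<-x+y")
## [1] "x" "y"
```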

As long as you don't evaluate the parsed text, I think you should be safe even if the user enters something like the following:

txt <- "z <- system('rm -Rf *')"

...
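To illustrate that parsing alone does not run anything, here is a small sketch with a harmless command substituted for the destructive one (the `echo` string is a stand-in chosen for this example). parse() only builds the expression object, and all.vars() only inspects it:

```r
# Harmless stand-in for a dangerous command; it is parsed but never executed.
txt <- "z <- system('echo this never runs')"
p <- parse(text = txt)

class(p)
## [1] "expression"

# The right-hand side contains a function name and a string constant,
# but no variables; all.vars() skips function names by default.
vars <- all.vars(p[[1]][[3]])
vars
## character(0)
```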

I think this will still work even when both sides get fairly complicated, because the <- operator has quite low precedence:

txt <- "names(x)[1] <- as.character(a * log(x))"
all.vars(parse(text=txt)[[1]][[3]])
## [1] "a" "x"
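To see why the low precedence matters, it can help to look at how the parsed call is laid out: `<-` sits at the top, with the whole left-hand side and the whole right-hand side as its two arguments. The sketch below also shows one caveat that is an observation added here, not something the answer states: with a replacement form such as `names(x)[1]` on the left, `x` must already exist, and looking only at the right-hand side would not reveal that on its own.

```r
e <- parse(text = "names(x)[1] <- as.character(a * log(x))")[[1]]

as.character(e[[1]])  # the top-level function of the call
## [1] "<-"
deparse(e[[2]])       # the entire left-hand side
## [1] "names(x)[1]"
deparse(e[[3]])       # the entire right-hand side
## [1] "as.character(a * log(x))"

# Variables mentioned on the left-hand side of a replacement call
lhs_vars <- all.vars(e[[2]])
lhs_vars
## [1] "x"
```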