Question

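I have a df with multiple y-series that I want to plot individually, so I wrote a fn that selects one particular series, assigns it to a local variable dat, then plots it. However, when ggplot/geom_step is called inside the fn, it doesn't treat the data properly as a single series. I don't see how this can be a scoping issue, since if dat weren't visible, surely ggplot would fail?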

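You can verify that the code is correct when executed from the top-level environment, but not inside the function. This is not a duplicate question. I understand the problem (it is a recurring issue with ggplot), but I have read through all the other answers and they do not provide a solution.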

set.seed(1234)
require(ggplot2)
require(scales)

N = 10
df <- data.frame(x = 1:N,
                 id_ = c(rep(20,N), rep(25,N), rep(33,N)),
                 y = c(runif(N, 1.2e6, 2.9e6), runif(N, 5.8e5, 8.9e5) ,runif(N, 2.4e5, 3.3e5)),
                 row.names=NULL)

plot_series <- function(id_, envir=environment()) {
  dat <- subset(df,id_==id_)
  p <- ggplot(data=dat, mapping=aes(x,y), color='red') + geom_step()
  # Unsuccessfully trying the approach from http://stackoverflow.com/questions/22287498/scoping-of-variables-in-aes-inside-a-function-in-ggplot
  p$plot_env <- envir
  plot(p)
  # Displays wrongly whether we do the plot here inside fn, or return the object to parent environment 
  return(p)
}

# BAD: doesn't plot geom_step!
plot_series(20)

# GOOD! but what's causing the difference?
ggplot(data=subset(df,id_==20), mapping=aes(x,y), color='red') + geom_step()

#plot_series(25)
#plot_series(33)

Answer 1

This works fine:

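plot_series <- function(id_) {
    # Standard evaluation: df$id_ is the column, the bare id_ is the argument
    dat <- df[df$id_ == id_,]
    # color is set on the geom; passed to ggplot() it is silently ignored
    p <- ggplot(data=dat, mapping=aes(x,y)) + geom_step(color='red')
    return(p)
}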

print(plot_series(20))

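If you simply step through your original function using debug, you will quickly see that the subset line does not actually subset the data frame at all: it returns all rows!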

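Why? Because subset uses non-standard evaluation, and you used the same name for both a column and a function argument. As jlhoward shows in the other answer, simply using different names for the two would have worked (though that is probably not the preferable fix).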

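The reason is that subset evaluates its condition in the data frame first. So all it sees in the logical expression is id_ == id_ within that data frame, which is always true.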

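One way to think about it is to play dumb (like a computer) and ask yourself: when presented with the condition id_ == id_, how do you know what exactly each symbol refers to? It is ambiguous, and subset makes a consistent choice: use what is in the data frame.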
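A minimal sketch of the difference, using a small toy data frame rather than the question's data. The helper names rows_via_subset and rows_via_index are made up for illustration; only base R is assumed.

```r
# Toy data with the same columns as the question's df (values are arbitrary)
df <- data.frame(x   = rep(1:3, times = 3),
                 id_ = rep(c(20, 25, 33), each = 3),
                 y   = 1:9)

# Argument name clashes with the column name, as in the question:
# inside subset(), both id_ symbols resolve to the column
rows_via_subset <- function(id_) {
  nrow(subset(df, id_ == id_))
}

# Standard evaluation: df$id_ is the column, the bare id_ is the argument
rows_via_index <- function(id_) {
  nrow(df[df$id_ == id_, ])
}

rows_via_subset(20)  # 9: every row is kept
rows_via_index(20)   # 3: only the rows for series 20
```

A quick nrow(dat) check like this inside the plotting function is often enough to spot the problem without stepping through with debug.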

Answer 2

Notwithstanding the comments, this works:

plot_series <- function(z, envir=environment()) {
  dat <- subset(df,id_==z)
  p <- ggplot(data=dat, mapping=aes(x,y), color='red') + geom_step()
  p$plot_env <- envir
  plot(p)
  # Displays correctly now: the argument z no longer shares a name with the id_ column
  return(p)
}

plot_series(20)

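The problem seems to be that subset interprets the id_ on the RHS of == as the same thing as the id_ on the LHS, which is equivalent to subsetting on T and of course includes all the rows of df. That is the plot you are seeing.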
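The condition itself can be inspected directly. The sketch below imitates, roughly, how subset evaluates its condition: names are looked up in the data frame first, and the calling environment is consulted only for names that are not columns. The functions cond_same_name and cond_renamed are hypothetical helpers, and the data frame is a toy stand-in for the question's df.

```r
# Toy data: two rows per series (values are arbitrary)
df <- data.frame(x   = rep(1:2, times = 3),
                 id_ = rep(c(20, 25, 33), each = 2),
                 y   = 1:6)

# Argument named like the column: both sides of == come from df
cond_same_name <- function(id_) {
  eval(quote(id_ == id_), df, environment())
}

# Argument renamed to z: z is not a column, so it is found in the function's environment
cond_renamed <- function(z) {
  eval(quote(id_ == z), df, environment())
}

cond_same_name(20)  # TRUE for every row, so nothing is filtered out
cond_renamed(20)    # TRUE only for the two rows of series 20
```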

Plotting inside a function: subset(df, id_==...) gives the wrong plot, df[df$id_==...,] is right
