Question

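I'm doing some analysis in ggplot2 for a project, and by chance I stumbled across some (to me) weird behaviour that I can't explain. When I write aes(x = cyl, ...), the plot looks different from the one I get when I pass the same variable as aes(x = mtcars$cyl, ...). When I remove facet_grid(am ~ .), both graphs are the same again. The code below is modelled on the code in my project that produces the same behaviour: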

library(dplyr)
library(ggplot2)

data = mtcars

test.data = data %>%
  select(-hp)


ggplot(test.data, aes(x = test.data$cyl, y = mpg)) +
  geom_point() + 
  facet_grid(am ~ .) +
  labs(title="graph 1 - dollar sign notation")

ggplot(test.data, aes(x = cyl, y = mpg)) +
  geom_point()+ 
  facet_grid(am ~ .) +
  labs(title="graph 2 - no dollar sign notation")

Here is the image of graph 1:

graph 1 - dollar sign notation

Here is the image of graph 2:

graph 2 - no dollar sign notation

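I found that I can work around the problem by using aes_string instead of aes and passing the variable names as strings, but I would like to understand why ggplot behaves this way. The problem also occurs in similar attempts with facet_wrap.

Many thanks in advance for any help! It makes me very uneasy when I don't understand something like this...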

Answer 1

tl;dr

Never use [ or $ inside aes().
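As a minimal sketch of what to write instead, using the mtcars data from the question and assuming a reasonably recent ggplot2 (3.0.0 or later, where the .data pronoun is available): refer to columns by bare name, or, when the column name is held in a string, index the .data pronoun. The variable x_col is a hypothetical name introduced only for this illustration.

```r
library(ggplot2)

# Bare column name: resolved inside the data given to ggplot().
p_bare <- ggplot(mtcars, aes(x = cyl, y = mpg)) +
  geom_point() +
  facet_grid(am ~ .)

# Column name stored in a string; x_col is a hypothetical variable.
# .data[[x_col]] is resolved inside the plot data, unlike mtcars[[x_col]].
x_col <- "cyl"
p_string <- ggplot(mtcars, aes(x = .data[[x_col]], y = mpg)) +
  geom_point() +
  facet_grid(am ~ .)
```

Both plots should put the same points in each panel, because in both cases the x values are taken from the same rows as am.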

Consider this illustrative example, where the facet variable f is deliberately given in a non-obvious order with respect to x:

d <- data.frame(x=1:10, f=rev(letters[gl(2,5)]))

Now contrast what happens in these two plots,

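# x is a bare column name, so it is looked up in d and stays linked to f
p1 <- ggplot(d) +
  facet_grid(.~f, labeller = label_both) +
  geom_text(aes(x, y=0, label=x, colour=f)) +
  ggtitle("good mapping")

# d$x is a free-standing vector, detached from the rows that get faceted
p2 <- ggplot(d) +
  facet_grid(.~f, labeller = label_both) +
  geom_text(aes(d$x, y=0, label=x, colour=f)) +
  ggtitle("$ corruption")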

By looking at the data.frame that ggplot2 creates internally for each panel, we can get a better idea of what is happening,

 ggplot_build(p1)[["data"]][[1]][,c("x","PANEL")]

    x PANEL
1   6     1
2   7     1
3   8     1
4   9     1
5  10     1
6   1     2
7   2     2
8   3     2
9   4     2
10  5     2

 ggplot_build(p2)[["data"]][[1]][,c("x", "PANEL")]

    x PANEL
1   1     1
2   2     1
3   3     1
4   4     1
5   5     1
6   6     2
7   7     2
8   8     2
9   9     2
10 10     2

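The mapping in the second plot is wrong because, when ggplot creates a data.frame for each panel, it picks the x values in the "wrong" order.

This happens because using $ breaks the link between the variables being mapped: ggplot has to treat d$x as an independent vector which, for all it knows, could come from an arbitrary, disconnected source. Since the data.frame in this example is not ordered by the factor f, the subset data.frames used internally for each panel end up assuming the wrong order.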
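The mismatch can be mimicked in base R without ggplot2. This is only a sketch: sorting the rows by f stands in for the per-panel subsetting and is an assumption made for illustration, not a description of ggplot2's actual internals. The names by_panel, linked, detached and comparison are invented for this example.

```r
d <- data.frame(x = 1:10, f = rev(letters[gl(2, 5)]))

# Reorder rows by the facet variable (a stand-in for splitting into panels).
by_panel <- d[order(d$f), ]

# Column looked up inside the reordered data: values travel with their rows.
linked <- by_panel$x

# Vector taken from the original data frame: it keeps the original row order
# and knows nothing about the reordering above.
detached <- d$x

# Side by side: 'linked' matches f, 'detached' is paired with the wrong rows.
comparison <- data.frame(f = by_panel$f, linked = linked, detached = detached)
print(comparison)
```

In the printed table the linked column follows f, while the detached column simply counts 1 to 10, which is the same pattern as the two ggplot_build() outputs above.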

Issue when passing a variable with dollar sign notation ($) to aes() in combination with facet_grid() or facet_wrap()
