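How do I overlay by-group plot elements on ggplot2 facets?

Asked: 2011-07-13 19:35:46

Tags: r ggplot2

My question has to do with faceting. In the example code below, I look at some faceted scatterplots, then try to overlay information (in this case, mean lines) on a per-facet basis.

The tl;dr version is that my attempts fail. Either the mean line I add is computed over all the data (ignoring the facet variable), or I try to write a formula and R throws an error, followed by incisive and particularly disparaging comments about my mother.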

library(ggplot2)

# Let's pretend we're exploring the relationship between a car's weight and its
# horsepower, using some sample data
p <- ggplot()
p <- p + geom_point(aes(x = wt, y = hp), data = mtcars)
print(p)

# Hmm. A quick check of the data reveals that car weights can differ wildly, by almost
# a thousand pounds.
head(mtcars)

# Does the difference matter? It might, especially if most 8-cylinder cars are heavy,
# and most 4-cylinder cars are light. ColorBrewer to the rescue!
p <- p + aes(color = factor(cyl))
p <- p + scale_color_brewer(palette = "Set1")
print(p)

# At this point, what would be great is if we could more strongly visually separate
# the cars out by their engine blocks.
p <- p + facet_grid(~ cyl)
print(p)

# Ah! Now we can see (given the fixed scales) that the 4-cylinder cars flock to the
# left on weight measures, while the 8-cylinder cars flock right. But you know what
# would be REALLY awesome? If we could visually compare the means of the car groups.
p.with.means <- p + geom_hline(
                      aes(yintercept = mean(hp)),
                      data = mtcars
         )
print(p.with.means)

# Wait, that's not right. That's not right at all. The green (8-cylinder) cars are all above the
# average for their group. Are they somehow made in an auto plant in Lake Wobegon, MN? Obviously,
# I meant to draw mean lines factored by GROUP. Except also obviously, since the code below will
# print an error, I don't know how.
p.with.non.lake.wobegon.means <- p + geom_hline(
                                       aes(yintercept = mean(hp) ~ cyl),
                                       data = mtcars
                                     )
print(p.with.non.lake.wobegon.means)

There must be some simple solution that I'm missing.

1 answer:

Answer 0 (score: 7)

Do you mean something like this:

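library(plyr)  # provides ddply() and summarise()

# One row per cylinder count, with the mean horsepower of that group
rs <- ddply(mtcars, .(cyl), summarise, mn = mean(hp))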

p + geom_hline(data=rs,aes(yintercept=mn))

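There may be a way to do this inside the ggplot call using one of the stat_* functions, but I'd have to go back and tinker a bit. Generally, though, when I add summaries to a faceted plot, I calculate the summaries separately and then add them with their own geom.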
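To make the difference between the two kinds of summary concrete, here is a minimal sketch in base R (no plyr), assuming the built-in mtcars data. The names overall_mean, group_means and mean_hp are illustrative and do not come from the original post.

```r
# Overall mean: a single number. Mapping aes(yintercept = mean(hp)) on the
# full data set therefore yields the same line in every panel.
overall_mean <- mean(mtcars$hp)

# Per-group means: one row per cylinder count. aggregate() is a base R
# alternative to the ddply() call above.
group_means <- aggregate(hp ~ cyl, data = mtcars, FUN = mean)
names(group_means)[names(group_means) == "hp"] <- "mean_hp"

print(overall_mean)
print(group_means)

# Because `cyl` survives as a column, a layer such as
#   geom_hline(data = group_means, aes(yintercept = mean_hp))
# can be matched to the panels created by facet_grid(~ cyl).
```

The key point is that the summary data frame keeps the faceting variable, which is what lets each row land in its own panel.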

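Edit

A few expanded notes on your original attempt. In general, it's a good idea to put the aes calls that should persist throughout the whole plot in ggplot(), and then specify different data sets or aesthetics only in those geoms that differ from the "base" plot. That way you don't need to specify data = ... in every geom.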
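As a rough illustration of that layout, the sketch below puts the shared data and aesthetics in ggplot() and overrides the data only in the layer that draws the means. It assumes the built-in mtcars data; group_means, mean_hp and p_means are illustrative names, and the summary is computed with base R's aggregate() rather than ddply() purely to keep the example self-contained.

```r
library(ggplot2)

# Per-group summary; keeping `cyl` lets the facets pick up the right row.
group_means <- aggregate(hp ~ cyl, data = mtcars, FUN = mean)
names(group_means)[names(group_means) == "hp"] <- "mean_hp"

# Base data and aesthetics live in ggplot(); geom_point() inherits them.
# Only the geom_hline() layer supplies its own data and mapping.
p_means <- ggplot(mtcars, aes(x = wt, y = hp, colour = factor(cyl))) +
  geom_point() +
  facet_grid(~ cyl) +
  geom_hline(data = group_means, aes(yintercept = mean_hp))

print(p_means)
```

Each panel should then show a horizontal line at the mean horsepower of its own cylinder group, rather than at the overall mean.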

Finally, I did come up with a somewhat clever way to use geom_smooth to do something similar to what you're asking:

p <- ggplot(data = mtcars,aes(x = wt, y = hp, colour = factor(cyl))) + 
    facet_grid(~cyl) + 
    geom_point() + 
    geom_smooth(se=FALSE,method="lm",formula=y~1,colour="black")

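The horizontal line (i.e. the constant regression equation) only extends to the limits of the data in each facet, but this approach skips the separate data summary step.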
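The reason this works is that an intercept-only linear model fitted by least squares estimates the mean of the response. A small check in base R, assuming the built-in mtcars data (by_cyl, intercepts and means are illustrative names):

```r
# Split horsepower by cylinder count, mirroring the three facets.
by_cyl <- split(mtcars$hp, mtcars$cyl)

# Fit the same kind of model that formula = y ~ 1 requests: intercept only.
intercepts <- sapply(by_cyl, function(hp) unname(coef(lm(hp ~ 1))))

# Plain group means for comparison.
means <- sapply(by_cyl, mean)

# The two should agree up to floating-point tolerance.
print(all.equal(intercepts, means))
```

So the black line drawn by geom_smooth in each facet sits at that facet's mean horsepower, just as the geom_hline version does.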