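My question has to do with facets. In the example code below, I look at some faceted scatterplots, then try to overlay information (in this case, mean lines) on a per-facet basis.
The tl;dr version is that my attempts fail. Either the mean lines I add are computed across all the data (ignoring the facet variable), or I try to write a formula and R throws an error, followed by incisive and particularly disparaging comments about my mother.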
library(ggplot2)
# Let's pretend we're exploring the relationship between a car's weight and its
# horsepower, using some sample data
p <- ggplot()
p <- p + geom_point(aes(x = wt, y = hp), data = mtcars)
print(p)
# Hmm. A quick check of the data reveals that car weights can differ wildly, by almost
# a thousand pounds.
head(mtcars)
# Does the difference matter? It might, especially if most 8-cylinder cars are heavy,
# and most 4-cylinder cars are light. ColorBrewer to the rescue!
p <- p + aes(color = factor(cyl))
p <- p + scale_color_brewer(palette = "Set1")
print(p)
# At this point, what would be great is if we could more strongly visually separate
# the cars out by their engine blocks.
p <- p + facet_grid(~ cyl)
print(p)
# Ah! Now we can see (given the fixed scales) that the 4-cylinder cars flock to the
# left on weight measures, while the 8-cylinder cars flock right. But you know what
# would be REALLY awesome? If we could visually compare the means of the car groups.
p.with.means <- p + geom_hline(
  aes(yintercept = mean(hp)),
  data = mtcars
)
print(p.with.means)
# Wait, that's not right. That's not right at all. The green (8-cylinder) cars are all above the
# average for their group. Are they somehow made in an auto plant in Lake Wobegon, MN? Obviously,
# I meant to draw mean lines factored by GROUP. Except also obviously, since the code below will
# print an error, I don't know how.
p.with.non.lake.wobegon.means <- p + geom_hline(
  aes(yintercept = mean(hp) ~ cyl),
  data = mtcars
)
print(p.with.non.lake.wobegon.means)
There must be some simple solution that I'm missing.
Answer 0 (score: 7)
Do you mean something like this:
library(plyr)
# Mean horsepower for each cylinder group
rs <- ddply(mtcars, .(cyl), summarise, mn = mean(hp))
p + geom_hline(data = rs, aes(yintercept = mn))
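There might be a way to do this within the ggplot call using the stat_* functions, but I'd have to go back and tinker a bit. Generally, though, when I add summaries to a faceted plot, I calculate the summaries separately and then add them with their own geom.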
Edit
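A few extended notes about the original attempt. In general, it's best to put the aes mappings that will persist throughout the whole plot in the ggplot call, and then specify in each geom only the datasets or aesthetics that differ from the "base" plot. Then you don't need to specify data = ... in every geom.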
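A minimal sketch of that layout, assuming a reasonably recent ggplot2: the shared data and aesthetics go in ggplot(), geom_point() inherits them, and only the mean-line layer brings its own data. The per-group means are computed here with base R's aggregate() rather than ddply, purely so the sketch needs no extra package; the names rs and mn mirror the earlier snippet.

```r
library(ggplot2)

# Per-cylinder mean horsepower; base R aggregate() stands in for plyr::ddply here
rs <- aggregate(hp ~ cyl, data = mtcars, FUN = mean)
names(rs)[names(rs) == "hp"] <- "mn"

# Data and aesthetics shared by every layer live in ggplot()
p <- ggplot(mtcars, aes(x = wt, y = hp, colour = factor(cyl))) +
  facet_grid(~ cyl) +
  geom_point() +                                # inherits data and aes
  geom_hline(data = rs, aes(yintercept = mn))   # only this layer overrides them

print(p)
```

Because rs carries a cyl column, facet_grid(~ cyl) routes each mean to its own panel, which is what the global mean(hp) in the original attempt could not do.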
Finally, I came up with a somewhat clever use of geom_smooth to do something similar to what you're asking for:
p <- ggplot(data = mtcars, aes(x = wt, y = hp, colour = factor(cyl))) +
  facet_grid(~ cyl) +
  geom_point() +
  geom_smooth(se = FALSE, method = "lm", formula = y ~ 1, colour = "black")
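The horizontal line (i.e. the constant regression equation) will only extend to the limits of the data in each facet, but this approach skips the separate data summary step.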
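If the shortened lines are a problem, one possible variation (a sketch, not part of the original answer) is geom_smooth's fullrange argument, which asks the fit to be drawn over the plot's x range rather than just the x range of the data in each group. The object name p_full is only illustrative.

```r
library(ggplot2)

p_full <- ggplot(mtcars, aes(x = wt, y = hp, colour = factor(cyl))) +
  facet_grid(~ cyl) +
  geom_point() +
  # formula = y ~ 1 fits an intercept-only model, i.e. the mean of hp in each panel;
  # fullrange = TRUE extends the fitted line across the plot's x range
  geom_smooth(se = FALSE, method = "lm", formula = y ~ 1,
              colour = "black", fullrange = TRUE)

print(p_full)
```

With the fixed scales used here, every panel shares the same x range, so each mean line should span the full observed range of weights. It still stops short of the padding at the panel edges, so geom_hline with a separate summary remains the option for a true edge-to-edge line.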