Question

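It seems that the density plot from stat_bin does not behave as expected for factor variables: every category gets a density of 1 on the y-axis.

For example, using the diamonds data: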

diamonds_small <- diamonds[sample(nrow(diamonds), 1000), ]
ggplot(diamonds_small, aes(x = cut)) +  stat_bin(aes(y=..density.., fill=cut))


I know I can use

stat_bin(aes(y=..count../sum(..count..), fill=cut))

to make it work. However, according to the stat_bin documentation, it should work with categorical variables.
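A likely explanation, offered as a sketch rather than a statement about stat_bin's internals: with a discrete x, each factor level ends up in its own group, and the density is normalised within each group, so every bar comes out as 1. The toy factor below is hypothetical and only stands in for diamonds_small$cut. It contrasts per-group normalisation with the global normalisation that ..count../sum(..count..) performs.

```r
# Hypothetical toy factor standing in for diamonds_small$cut
cut <- factor(c("Fair", "Good", "Good", "Ideal", "Ideal", "Ideal"))

counts <- table(cut)

# Per-group normalisation: each level is divided by its own count,
# which is the situation when every factor level is its own group.
within_group <- counts / counts

# Global normalisation: each level is divided by the total count,
# the same arithmetic as ..count../sum(..count..).
overall <- counts / sum(counts)

print(within_group)
print(overall)
```

Per-group normalisation gives 1 for every level, while the global version gives proportions that sum to 1.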

Answer 1

You can get what you probably want by setting the group aesthetic manually.

ggplot(diamonds_small, aes(x = cut)) +  stat_bin(aes(y=..density..,group=1))

However, you then cannot easily fill by group. You can summarize the data yourself instead:

library(plyr)
# Proportion of rows in each cut, relative to the whole sample
dd_dens <- ddply(diamonds_small, .(cut),
         function(x) data.frame(dens = nrow(x) / nrow(diamonds_small)))
ggplot(dd_dens, aes(x = cut, y = dens)) + geom_bar(aes(fill = cut), stat = "identity")

A slightly more compact version of the summary step:

as.data.frame.table(prop.table(table(diamonds_small$cut)))
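As a minimal sketch of how the compact summary could feed the plot: the toy factor and the column names cut and dens are assumptions chosen to match the ddply version, not output from the original data. The columns are renamed explicitly because the default names returned by as.data.frame.table depend on how table() was called.

```r
# Hypothetical toy factor standing in for diamonds_small$cut
cut <- factor(c("Fair", "Good", "Good", "Ideal", "Ideal", "Ideal"))

# Proportion of observations per level, as a data frame
dd_dens <- as.data.frame.table(prop.table(table(cut)))

# Rename explicitly so the plotting code does not rely on default names
names(dd_dens) <- c("cut", "dens")
print(dd_dens)

# Plot only if ggplot2 is installed
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(dd_dens, aes(x = cut, y = dens)) +
    geom_bar(aes(fill = cut), stat = "identity")
  print(p)
}
```

Because the proportions are computed before plotting, fill = cut can be mapped freely without affecting the bar heights.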

Using density in stat_bin with factor variables
