I want to create a grouped boxplot from my data:
X variable value
Cat1 Var1 10
Cat2 Var1 8
Cat3 Var1 7
Cat4 Var1 15
Cat1 Var2 4
Cat2 Var2 3
Cat3 Var2 4
Cat4 Var2 1
I was able to produce it with:
ggplot() +
geom_boxplot(aes(x=dataFiltered$X, y=dataFiltered$value, color=dataFiltered$variable))+
ylim(c(-5, 15))
Now I want to add extra points that show the mean of each boxplot. I tried:
ggplot() +
geom_boxplot(aes(x=dataFiltered$X, y=dataFiltered$value, color=dataFiltered$variable))+
ylim(c(-5, 15))+
geom_point(stat="identity", aes(x=means$`dataFiltered$X`, y=means$`dataFiltered$value`), col = "red",pch=18)
I also tried using facet_wrap, but I could not get past this error:
ggplot() +
geom_boxplot(aes(x=dataFiltered$X, y=dataFiltered$value, color=dataFiltered$variable))+
ylim(c(-5, 15))+
geom_point(stat="identity", aes(x=means$`dataFiltered$X`, y=means$`dataFiltered$value`), col = "red",pch=18) +
facet_wrap(~means$`dataFiltered$variable`, scales='free')
Error in layout_base... At least one layer must contain all variables used for facetting.
Is there a way to add the means to a grouped boxplot?
Answer 0: (score: 2)
Try adding a stat_summary() call:
library(dplyr)
library(tidyr)
library(ggplot2)
# Build a one-column data frame from the raw text rows
df <- data.frame(V1 = c(
"Cat1 Var1 10",
"Cat2 Var1 8",
"Cat3 Var1 7",
"Cat4 Var1 15",
"Cat1 Var2 4",
"Cat2 Var2 3",
"Cat3 Var2 4",
"Cat4 Var2 1"), stringsAsFactors = FALSE)
df2 <- df %>%
separate(V1, c("X", "variable", "value"), sep="\\s+") %>%
mutate(value = as.integer(value))
ggplot(df2, aes(x=X, y=value, color=variable)) +
geom_boxplot()+
ylim(c(-5, 15)) +
# `fun` replaces `fun.y`, which was deprecated in ggplot2 3.3.0
stat_summary(geom = "point", fun = mean, colour = "red", size = 4)
If you want one mean point per group, dodged to sit on its own box, try the following:
ggplot(df2, aes(x=X, y=value, color=variable)) +
geom_boxplot()+
ylim(c(-5, 15)) +
# width = 0.75 matches the default dodge width of geom_boxplot()
stat_summary(geom = "point", aes(group=variable),
fun = mean, size = 4, position=position_dodge(width=0.75))
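The question's approach of precomputing the means can also work, provided the summary table keeps the same column names as the plotted data so that both layers share the aesthetics and the facetting variable. The following is only a sketch under that assumption: the data frame is rebuilt by hand, `means` and `p` are illustrative names, and the `.groups` argument needs dplyr 1.0.0 or later.

```r
library(dplyr)
library(ggplot2)

# Sample data from the question, rebuilt so the block is self-contained.
df2 <- data.frame(
  X        = rep(paste0("Cat", 1:4), times = 2),
  variable = rep(c("Var1", "Var2"), each = 4),
  value    = c(10, 8, 7, 15, 4, 3, 4, 1)
)

# One row per X/variable combination, keeping the original column names.
means <- df2 %>%
  group_by(X, variable) %>%
  summarise(value = mean(value), .groups = "drop")

# Both layers inherit x, y and color from ggplot(); only the data differs.
p <- ggplot(df2, aes(x = X, y = value, color = variable)) +
  geom_boxplot() +
  geom_point(data = means, aes(group = variable),
             position = position_dodge(width = 0.75),
             shape = 18, size = 4)

# Facetting is possible here because every layer has a `variable` column.
p_faceted <- p + facet_wrap(~variable)
```

Mapping bare column names inside aes(), rather than `dataFiltered$X` style vectors, is what lets facet_wrap() find the facetting variable in each layer's data.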
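Keep in mind that these plots can be misleading when the sample size is small. In the sample data above, each X/variable combination has a single observation, so every box collapses to a line and the mean is simply that value.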
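One way to check for this is to count the observations per group and draw the raw points over the boxes. This is a sketch on the question's sample data; `counts` and `p` are illustrative names, not part of the original answer.

```r
library(dplyr)
library(ggplot2)

# Sample data from the question, rebuilt so the block is self-contained.
df2 <- data.frame(
  X        = rep(paste0("Cat", 1:4), times = 2),
  variable = rep(c("Var1", "Var2"), each = 4),
  value    = c(10, 8, 7, 15, 4, 3, 4, 1)
)

# Number of observations behind each box (stored in column `n`).
counts <- count(df2, X, variable)

# Overlay the raw observations, dodged the same way as the boxes.
p <- ggplot(df2, aes(x = X, y = value, color = variable)) +
  geom_boxplot() +
  geom_point(position = position_dodge(width = 0.75), alpha = 0.6)
```

If `counts$n` is small for most groups, the raw points alone may be a more honest display than boxes plus means.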