boxplot: order by the mean of a subset of each group

Asked: 2015-03-23 22:43:53

Tags: r ggplot2 boxplot

Let's consider this data:

df = data.frame('score'=round(runif(15, 1, 10)),
                'group'=paste0("a",rep(c(1,2,3),each=5)),
                'category'=rep(c("big", "big", "big", "big", "small"), 3))

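I want to draw a boxplot of this data with ggplot2. What I want is boxplot(score ~ group), but with the boxes ordered by the mean score of the "big" observations in each group.

I could not find a simple way to do this without creating new variables. Using dplyr is fine. Thanks.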

1 Answer:

Answer 0 (score: 2)

I don't know whether this qualifies as a simple way. Personally I find it simple, but I used dplyr to find the means:

#find the means for each group
library(dplyr)
means <-
df %>%
  #filter out small since you only need category equal to 'big'
  filter(category=='big') %>%
  #use the same groups as in the ggplot
  group_by(group) %>%
  #calculate the means
  summarise(mean = mean(score))

#order the groups according to the order of the means
myorder <- means$group[order(means$mean)]

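In this case the order is as follows (it will differ between runs, because the data are generated with runif and no seed is set):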

> myorder
[1] a1 a2 a3
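
Because the question's scores are random, the result above cannot be reproduced exactly. The following is a minimal sketch with fixed, made-up scores (an assumption for illustration, not the asker's data) so the ordering can be checked by hand. It uses base R `tapply` in place of the dplyr pipeline; the names `df_fixed`, `big_means` and `order_fixed` are hypothetical.

```r
# Hypothetical fixed scores (not the question's random data), chosen so that
# ordering by the "big" mean differs from ordering by the overall mean
df_fixed <- data.frame(
  score    = c(9, 8, 9, 8, 1,    # a1: "big" mean 8.5, overall mean 7.0
               3, 4, 3, 4, 10,   # a2: "big" mean 3.5, overall mean 4.8
               5, 6, 5, 6, 1),   # a3: "big" mean 5.5, overall mean 4.6
  group    = paste0("a", rep(c(1, 2, 3), each = 5)),
  category = rep(c("big", "big", "big", "big", "small"), 3)
)

# Keep only the "big" rows, then average the score within each group
big <- df_fixed[df_fixed$category == "big", ]
big_means <- tapply(big$score, big$group, mean)

# Group names in ascending order of the "big" mean
order_fixed <- names(big_means)[order(big_means)]
print(order_fixed)
```

With these scores the groups come out as a2, a3, a1, whereas ordering by the overall mean would put a3 before a2. That difference is the reason for filtering on category before summarising.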

To arrange the boxplots in the order above, you only need to do this:

library(ggplot2)
ggplot(df, aes(group, score)) +
  geom_boxplot() +
  #you just need to use scale_x_discrete with the limits argument
  #to pass in details of the order of appearance for the boxplots
  #in this case the order is the myorder vector
  scale_x_discrete(limits=myorder)

That's it.
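
If modifying the data frame is acceptable, an alternative is to store the order in the factor levels of group, so that no scale call is needed. This is only a sketch under that assumption (the question preferred not to create new variables). The data are made up for illustration, and the names `df2`, `big` and `big_means` are hypothetical.

```r
library(ggplot2)

# Hypothetical fixed data with the same layout as the question's df
df2 <- data.frame(
  score    = c(9, 8, 9, 8, 1,  3, 4, 3, 4, 10,  5, 6, 5, 6, 1),
  group    = paste0("a", rep(c(1, 2, 3), each = 5)),
  category = rep(c("big", "big", "big", "big", "small"), 3)
)

# Mean score of the "big" rows in each group
big <- subset(df2, category == "big")
big_means <- tapply(big$score, big$group, mean)

# Make group a factor whose levels follow the ascending "big" mean;
# ggplot2 draws discrete axes in factor-level order
df2$group <- factor(df2$group, levels = names(big_means)[order(big_means)])

p <- ggplot(df2, aes(group, score)) +
  geom_boxplot()
print(p)
```

The trade-off is that the order now lives in the data rather than in the plot, so every later plot of df2 uses it too. The scale_x_discrete(limits = ...) approach keeps the data untouched.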
