How to add the percentage of each category to a stacked bar chart (ggplot2) (for a "non-percentage" stacked chart)

Time: 2019-03-02 04:30:27

Tags: r ggplot2

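How can I add the percentage of each category along the axis of a stacked bar chart, rather than inside the fill segments? For example, I have the following dataset: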

library(dplyr)
library(ggplot2)

df<-structure(list(age_group = structure(c(3L, 3L, 5L, 3L, 5L, 5L, 
5L, 3L, 5L, 5L, 4L, 4L, 4L, 3L, 5L), .Label = c("65+", "55-64", 
"45-54", "35-44", "25-34", "18-24"), class = "factor"), Gender = c("F", 
"M", "M", "M", "F", "M", "M", "M", "F", "M", "M", "F", "M", "F", 
"M")), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, 
-15L), .Names = c("age_group", "Gender"))

dat <- aggregate(list(value = 1:NROW(df)), df[c("age_group", "Gender")], length)
dat$proportion <- ave(dat$value, dat$age_group, FUN = function(x) (x/sum(x)*100))
dat$proportionR <- round(dat$proportion, digits =0)

dat<-dat %>%
  group_by(age_group) %>%
  mutate(age_per = sum(value)) %>%
  ungroup() %>%
  mutate(age_per = round((age_per/sum(value))*100))

ggplot(dat, aes(x = age_group, y = value, fill = Gender)) +
  geom_col() + coord_flip() + ylab("Visits 2018-2019") +xlab("") +
  scale_fill_manual(values= c("#740404", "#AB6868", "#D5B3B3"), labels = c("Females", "Males", "N/A")) +
  theme(legend.title=element_blank()) +
  geom_text(aes(label = paste0(age_per, "%")), hjust = 2.7, position = "stack", color = "white", size =5)


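What I want is an automated way to add the overall percentage of each group along the y-axis, regardless of the percentages within each group. My workflow computes the correct percentage but repeats it on every subgroup in the stack. I would like the geom_text label placed in the blank space just past the end of each bar.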

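Note that this question is not a duplicate of the SO question "Adding percentage labels to a bar chart in ggplot2": this one deals with percentages when each bar contains stacked groups, whereas the former covers only plain bar charts.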

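Also, the emphasis is on automation. I can do the following, but my real dataset has many more age-group intervals, which makes this approach impractical.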

ggplot(dat, aes(x = age_group, y = value, fill = Gender)) +
  geom_col() + coord_flip() + ylab("Visits 2018-2019") +xlab("") +
  scale_fill_manual(values= c("#740404", "#AB6868", "#D5B3B3"), labels = c("Females", "Males", "N/A")) +
  theme(legend.title=element_blank()) +
  geom_text(aes(y= 5.2, x=1, label = "33%"), color = "#740404", size =5) +
  geom_text(aes(y= 3.2, x=2, label = "20%"), color = "#740404", size =5) +
  geom_text(aes(y= 7.2, x=3, label = "47%"), color = "#740404", size =5) 
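
For reference, the hard-coded labels and positions above can be reproduced from per-group totals. The following is a minimal sketch in base R, not part of the original post. The names `totals`, `pct`, and `label_y` are hypothetical, the counts are typed in by hand so the block runs on its own, and the 0.2 offset is simply the gap used in the manual calls above.

```r
# Aggregated counts, as produced by the aggregate() call earlier
# (hard-coded here so the sketch is self-contained).
dat <- data.frame(
  age_group = factor(rep(c("45-54", "35-44", "25-34"), times = 2),
                     levels = c("45-54", "35-44", "25-34")),
  Gender = rep(c("F", "M"), each = 3),
  value = c(2, 1, 2, 3, 2, 5)
)

# Collapse to one row per age group, so each bar gets exactly one label.
totals <- aggregate(value ~ age_group, data = dat, FUN = sum)

# Share of all visits per age group, rounded to whole percent.
totals$pct <- round(totals$value / sum(totals$value) * 100)

# Assumed label position: bar total plus a small offset past the bar end.
totals$label_y <- totals$value + 0.2

totals
```

With the sample data this gives 33, 20, and 47 percent at y positions 5.2, 3.2, and 7.2, which are the values typed in manually above.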


1 Answer:

Answer 0 (score: 1)

Consider annotating with a grouped percentage calculation. Since you need to add three numbers to a plot with six series, annotate can be used, because its labels do not have to follow the grouped series. Also, use the proper gender and age-group percentages. Below, a different base::ave call replaces your dplyr::group_by.

agg_df <- aggregate(list(value = 1:NROW(df)), df[c("age_group", "Gender")], length)

dat <- within(agg_df, {
  proportion <- ave(value, age_group, FUN = function(x) (x/sum(x)*100))
  proportionR <- round(proportion, digits=0)
  age_per <- round((ave(value, age_group, Gender, FUN=sum) / sum(value)) * 100)
  grp_pct <- round((ave(value, age_group, FUN=sum) / sum(value)) * 100)
})

dat
#   age_group Gender value grp_pct age_per proportionR proportion
# 1     45-54      F     2      33      13          40   40.00000
# 2     35-44      F     1      20       7          33   33.33333
# 3     25-34      F     2      47      13          29   28.57143
# 4     45-54      M     3      33      20          60   60.00000
# 5     35-44      M     2      20      13          67   66.66667
# 6     25-34      M     5      47      33          71   71.42857

ggplot(dat, aes(x = age_group, y = value, fill = Gender)) +
  geom_col() + coord_flip() + ylab("Visits 2018-2019") +xlab("") +
  scale_fill_manual(values= c("#740404", "#AB6868", "#D5B3B3"), labels = c("Females", "Males", "N/A")) +
  theme(legend.title=element_blank()) +
  geom_text(aes(label = paste0(age_per, "%")), hjust = 2.7, position = "stack", color = "white", size =5) +
  annotate("text", x=1, y=5.25, label = paste0(dat$grp_pct[[1]], "%")) +
  annotate("text", x=2, y=3.25, label = paste0(dat$grp_pct[[2]], "%")) +
  annotate("text", x=3, y=7.25, label = paste0(dat$grp_pct[[3]], "%"))

Plot Output

For dynamic annotation, you may have to use the functional form of ggplot with Reduce, where + (not actually the arithmetic addition operator) is exposed as the +.gg() operator. Then call mapply to iterate over unique(grp_pct), passing the x coordinate positions and the annotation labels. The remaining challenge is that the best y coordinate is unknown.
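
The dynamic approach described above could look roughly like the following. This is a minimal sketch, not the answer's original code. The `totals` data frame, the `layers` list, and the choice of y (bar total plus 0.25, mirroring the manual offsets used earlier) are assumptions made for illustration, and the counts are typed in by hand so the block runs on its own.

```r
library(ggplot2)

# Aggregated counts copied from the dat table above
# (one row per age group and gender).
dat <- data.frame(
  age_group = factor(rep(c("45-54", "35-44", "25-34"), times = 2),
                     levels = c("45-54", "35-44", "25-34")),
  Gender = rep(c("F", "M"), each = 3),
  value = c(2, 1, 2, 3, 2, 5)
)

# One row per age group: bar total and its share of all visits.
totals <- aggregate(value ~ age_group, data = dat, FUN = sum)
totals$grp_pct <- round(totals$value / sum(totals$value) * 100)

base_plot <- ggplot(dat, aes(x = age_group, y = value, fill = Gender)) +
  geom_col() + coord_flip() + ylab("Visits 2018-2019") + xlab("")

# Build one annotate() layer per age group.
# Assumption: y = bar total + 0.25 places the label just past the bar end.
layers <- mapply(
  function(x, y, pct) {
    annotate("text", x = x, y = y + 0.25, label = paste0(pct, "%"))
  },
  seq_len(nrow(totals)), totals$value, totals$grp_pct,
  SIMPLIFY = FALSE
)

# Fold the layers onto the plot; `+` dispatches to ggplot2's method
# for plot objects rather than arithmetic addition.
p <- Reduce(`+`, layers, base_plot)
p
```

Taking y from the per-group totals is one way around the unknown y coordinate, although the fixed offset may need tuning for data on a different scale. ggplot2 also accepts a list of layers on the right-hand side of +, so `base_plot + layers` should give the same plot without Reduce.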