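Adding whole-group percentages to a stacked ggplot2 bar chart

Time: 2017-08-04 08:41:58

Tags: r ggplot2 bar-chart

I am trying to add group percentages to a stacked ggplot2 bar chart with geom_text(), computed on the y-axis. I have already read this question, but I don't think it gets me to a solution.

Here is a reproducible example: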

library(ggplot2)
library(scales)

df <- data.frame(Var1 = rep(c("A", "B", "C"), each = 3),
                 Var2 = rep(c("Gr1", "Gr2", "Gr3"), 3),
                 Freq = c(10, 15, 5, 5, 4, 3, 2, 10, 15))

ggplot(df) + aes(x = Var2, y = Freq, fill = Var1) +
  geom_bar(stat = "identity") +
  geom_text(aes(y = ..count.., label = scales::percent(..count../sum(..count..))),
            stat = "count")

This is the result:

stacked bar chart

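Just to make sure you understand what I want: I want the percentage for Gr1, Gr2 and Gr3 above each bar, with the three values adding up to 100%.

Basically, these are the values I get from: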

prop.table(tapply(df$Freq, df$Var2, sum))
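
For reference, here is a minimal base R sketch of those target values, formatted as the labels that should appear above the bars. The names totals, shares and labels are introduced here for illustration only and are not part of the original post.

```r
# Same data as in the reproducible example above
df <- data.frame(Var1 = rep(c("A", "B", "C"), each = 3),
                 Var2 = rep(c("Gr1", "Gr2", "Gr3"), 3),
                 Freq = c(10, 15, 5, 5, 4, 3, 2, 10, 15))

# Total Freq per bar: Gr1 = 17, Gr2 = 29, Gr3 = 23 (grand total 69)
totals <- tapply(df$Freq, df$Var2, sum)

# Share of the grand total for each bar; the shares sum to 1
shares <- prop.table(totals)

# Percentage labels with one decimal: "24.6%", "42.0%", "33.3%"
labels <- sprintf("%.1f%%", 100 * shares)
print(labels)
```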

Thanks!

1 answer:

Answer 0 (score: 1)

I would suggest creating a precomputed data.frame. I'll do it with dplyr, but you can use whatever you like:

library('dplyr')

df2 <- df %>% 
  arrange(Var2, desc(Var1)) %>% # Rearranging in stacking order      
  group_by(Var2) %>% # For each Gr in Var2 
  mutate(Freq2 = cumsum(Freq), # Calculating position of stacked Freq
         prop = 100*Freq/sum(Freq)) # Calculating proportion of Freq

df2

# A tibble: 9 x 5
# Groups:   Var2 [3]
   Var1  Var2  Freq Freq2     prop
  <chr> <chr> <dbl> <dbl>    <dbl>
1     C   Gr1     2     2 11.76471
2     B   Gr1     5     7 29.41176
3     A   Gr1    10    17 58.82353
4     C   Gr2    10    10 34.48276
5     B   Gr2     4    14 13.79310
6     A   Gr2    15    29 51.72414
7     C   Gr3    15    15 65.21739
8     B   Gr3     3    18 13.04348
9     A   Gr3     5    23 21.73913

Resulting plot:

ggplot(data = df2,
       aes(x = Var2, y = Freq,
           fill = Var1)) +
  geom_bar(stat = "identity") +
  geom_text(aes(y = Freq2 + 1,
                label = sprintf('%.2f%%', prop)))

Resulting plot
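
As a possible alternative to computing the cumulative label positions by hand, ggplot2 can stack the labels itself with position_stack(). The sketch below assumes ggplot2 2.2.0 or later (where the vjust argument is available) and uses base R's ave() for the within-bar shares. It centres each label inside its segment instead of placing it just above the segment's top edge, so it is a variation on the plot above, not a reproduction of it.

```r
library(ggplot2)

# Same data as in the question
df <- data.frame(Var1 = rep(c("A", "B", "C"), each = 3),
                 Var2 = rep(c("Gr1", "Gr2", "Gr3"), 3),
                 Freq = c(10, 15, 5, 5, 4, 3, 2, 10, 15))

# Share of each segment within its own bar, in percent
df$prop <- 100 * df$Freq / ave(df$Freq, df$Var2, FUN = sum)

p <- ggplot(df, aes(x = Var2, y = Freq, fill = Var1)) +
  geom_bar(stat = "identity") +
  # position_stack() stacks the labels the same way the bars are stacked;
  # vjust = 0.5 puts each label in the middle of its segment
  geom_text(aes(label = sprintf("%.2f%%", prop)),
            position = position_stack(vjust = 0.5))
print(p)
```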

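Edit:

OK, I misunderstood you a little. But I would use the same approach: in my experience it is better to keep most of the calculations out of ggplot, which makes the result easier to predict.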

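df2 <- df %>% # Assigned to df2 because the plot below reads Freq and Prop from it
  group_by(Var2) %>% # For each Gr in Var2
  summarise(Freq = sum(Freq)) %>% # Total Freq per bar
  mutate(Prop = 100*Freq/sum(Freq)) # Share of the grand total, sums to 100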

ggplot(data = df,
       aes(x = Var2, y = Freq)) +
  geom_bar(stat = "identity",
           aes(fill = Var1)) +
  geom_text(data = df2,
            aes(y = Freq + 1,
                label = sprintf('%.2f%%', Prop)))

New plot:
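
Since the answer notes that any tool can be used for the precomputation, here is a sketch of the same idea without dplyr, using base R's aggregate(). The data frame name totals is an illustrative choice, not something from the original answer.

```r
library(ggplot2)

# Same data as in the question
df <- data.frame(Var1 = rep(c("A", "B", "C"), each = 3),
                 Var2 = rep(c("Gr1", "Gr2", "Gr3"), 3),
                 Freq = c(10, 15, 5, 5, 4, 3, 2, 10, 15))

# One row per bar with its total Freq: Gr1 = 17, Gr2 = 29, Gr3 = 23
totals <- aggregate(Freq ~ Var2, data = df, FUN = sum)

# Share of the grand total for each bar, in percent (sums to 100)
totals$Prop <- 100 * totals$Freq / sum(totals$Freq)

p <- ggplot(df, aes(x = Var2, y = Freq)) +
  geom_bar(stat = "identity", aes(fill = Var1)) +
  # The label layer reads from the precomputed totals; y = Freq + 1
  # places each label just above the top of its bar
  geom_text(data = totals,
            aes(y = Freq + 1, label = sprintf("%.2f%%", Prop)))
print(p)
```

Keeping fill inside geom_bar() instead of the top-level aes() matters here: the label layer's data has no Var1 column, so a globally mapped fill would fail to resolve for geom_text().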