ggplot bar chart with percentages and without collapsing the data?

Asked: 2019-10-23 12:39:50

Tags: r ggplot2

I have a data frame that looks like this:

data <- structure(list(
  Sex = c("Male", "Male", "Male", "Male", "Female", "Male", "Female",
          "Female", "Female", "Female", "Male", "Female", "Female",
          "Female", "Male"),
  Nationality = c("USA", "USA", "USA", "UK", "UK", "UK", "France",
                  "France", "France", "France", "France", "USA",
                  "Canada", "Canada", "Mexico")
), row.names = c(NA, 15L), class = "data.frame")

I have plotted it like this:

library(ggplot2)
library(scales)  # provides percent() for the axis labels

ggplot(data, aes(x = factor(Nationality))) +
  geom_bar(aes(y = (..count..)/sum(..count..), fill = Sex), width = 0.3) +
  scale_y_continuous(labels = percent, limits = c(0, 0.4)) +
  coord_flip()

I would like to do two things:

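(1) Reorder the bars in descending order, so that the first bar is the one with the highest count. I have tried reorder, as suggested in other Stack Overflow questions, but I could not get it to work. Is that because I am using percentages? Note that I do not want to plot the summed counts per nationality, because I still want to be able to show Sex in the plot (i.e. the data must not be collapsed). I believe this particular problem has not been answered before.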

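(2) Add a label with the count inside each bar. I tried the following, but it did not work. The problem is that I do not know how to refer to the count in this situation.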

geom_text(aes(label = Nationality), nudge_y = +1)

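Note. To clarify what I mean by not collapsing the data: I know I could use mutate and create a new data frame containing the total count for each nationality. But I would then lose the count for each sex (the data would be collapsed), so I could no longer show Sex in the plot.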
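
As a minimal illustration of that distinction, the sketch below uses only base R on the example data (re-created here so the block runs on its own). The names `totals` and `by_sex` are hypothetical and not part of the original post.

```r
# Re-create the example data from the question.
data <- data.frame(
  Sex = c("Male", "Male", "Male", "Male", "Female", "Male", "Female",
          "Female", "Female", "Female", "Male", "Female", "Female",
          "Female", "Male"),
  Nationality = c("USA", "USA", "USA", "UK", "UK", "UK", "France",
                  "France", "France", "France", "France", "USA",
                  "Canada", "Canada", "Mexico")
)

# Collapsed: one total per nationality; Sex is no longer available.
totals <- table(data$Nationality)

# Not collapsed: one count per Nationality/Sex combination.
by_sex <- table(data$Nationality, data$Sex)

print(totals)
print(by_sex)
```

The totals are enough to decide the order of the bars, but only the per-sex counts can drive the fill colour.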

1 answer:

Answer 0 (score: 2)

Does this work for you?

library(ggplot2)
library(dplyr)
library(forcats)
library(scales)

data %>%
  # convert Nationality to factor with levels sorted according to 
  # each Nationality's total count, in reverse (i.e. descending) order
  mutate(Nationality = fct_rev(fct_infreq(Nationality))) %>%

  # aggregate by both Nationality & Sex, and calculate percentage
  count(Nationality, Sex) %>%
  mutate(p = n/sum(n)) %>%

  ggplot(aes(x = Nationality, y = p, label = n, fill = Sex)) +
  geom_col(width = 0.3) +
  geom_text(position = position_stack(vjust = 0.5)) +
  scale_y_continuous(labels = percent, limits = c(0, 0.4)) +
  coord_flip()

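
A possible variant, sketched here under the assumption that ggplot2 3.3.0 or later is installed (for `after_stat()`), keeps the raw rows and lets `stat = "count"` do the counting inside the plot. Only the factor levels are changed beforehand. This is an illustration, not the answerer's method; the object name `p` is hypothetical.

```r
library(ggplot2)
library(forcats)
library(scales)

# Re-create the example data from the question.
data <- data.frame(
  Sex = c("Male", "Male", "Male", "Male", "Female", "Male", "Female",
          "Female", "Female", "Female", "Male", "Female", "Female",
          "Female", "Male"),
  Nationality = c("USA", "USA", "USA", "UK", "UK", "UK", "France",
                  "France", "France", "France", "France", "USA",
                  "Canada", "Canada", "Mexico")
)

# Order the levels by frequency, then reverse them so that the most
# frequent nationality ends up at the top after coord_flip().
data$Nationality <- fct_rev(fct_infreq(data$Nationality))

p <- ggplot(data, aes(x = Nationality, fill = Sex)) +
  # Bar height: each Nationality/Sex count as a share of all rows.
  geom_bar(aes(y = after_stat(count / sum(count))), width = 0.3) +
  # Label: the raw count, centred within each stacked segment.
  geom_text(
    aes(y = after_stat(count / sum(count)), label = after_stat(count)),
    stat = "count",
    position = position_stack(vjust = 0.5)
  ) +
  scale_y_continuous(labels = percent, limits = c(0, 0.4)) +
  coord_flip()

print(p)
```

Because the counting happens in the stat, no aggregated data frame is needed; the trade-off is that the percentage expression has to be repeated in both layers.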