How do I label the count in each bin of a ggridges plot?

Time: 2020-10-27 00:15:55

Tags: r ggridges

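I have a data frame with two columns, team and rank, that simulates an NFL season. I am trying to use ggridges to plot the frequency distribution of ranks 1 to 10 for each team. I can get the plot to work, but I would also like to show the count for each team/rank combination on its bin. So far I have not been successful.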

   library(ggplot2)
   library(ggridges)

   ggplot(results,
          aes(x = rank, y = team, group = team)) +
     geom_density_ridges2(aes(fill = team), stat = "binline", binwidth = 1, scale = 0.9, draw_baseline = TRUE) +
     scale_x_continuous(limits = c(0, 11), breaks = seq(1, 10, 1)) +
     theme_ridges() +
     theme(legend.position = "none") +
     scale_fill_manual(values = c("#4F2E84", "#FB4F14", "#7C1415", "#A71930", "#00143F", "#0C264C", "#192E6C", "#136677", "#203731"), name = NULL)

which produces this plot:

[image]

I tried adding the following layer to put the counts on each bin, but it did not work.

   geom_text(stat='bin', aes(y = team + 0.95*stat(count/max(count)),
                         label = ifelse(stat(count) > 0, stat(count), ""))) +

This is not the exact dataset, but it is at least enough to run the original plot:

   results = data.frame(team = rep(c('Jets', 'Giants', 'Washington', 'Falcons', 'Bengals', 'Jaguars', 'Texans', 'Cowboys', 'Vikings'), 1000), rank = sample(1:20,9000,replace = T))
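Note that the sample data draws ranks from 1 to 20, while the plot restricts the x axis to `limits = c(0, 11)`. Values outside continuous scale limits are dropped before the bins are computed, so only part of the sample ends up in the plot. A minimal sketch of how to check this, assuming the same `results` layout as above (the `set.seed()` call and the `in_range` name are additions for illustration only):

```r
set.seed(2020)  # assumption: seed added only to make the sketch reproducible

teams <- c("Jets", "Giants", "Washington", "Falcons", "Bengals",
           "Jaguars", "Texans", "Cowboys", "Vikings")
results <- data.frame(team = rep(teams, 1000),
                      rank = sample(1:20, 9000, replace = TRUE))

# Rows inside the x limits used in the plot, c(0, 11)
in_range <- results$rank >= 0 & results$rank <= 11

# How many rows are kept (TRUE) versus dropped (FALSE) by the scale limits
table(in_range)
```

If the warning about removed rows is a nuisance, filtering the data to the wanted ranks before plotting is one way to avoid it.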

2 answers:

Answer 0: (score: 4)

How about counting the rows in each bin, joining the counts back to the original data, and using the new variable n as the label?

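library(dplyr)    # for count, left_join
library(ggplot2)
library(ggridges)

results %>%
  count(team, rank) %>%                           # n = rows per team/rank bin
  left_join(results, by = c("team", "rank")) %>%  # back to one row per observation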
  ggplot(aes(rank, team, group = team)) +
  geom_density_ridges2(aes(fill = team), 
                       stat = 'binline', 
                       binwidth = 1, 
                       scale = 0.9, 
                       draw_baseline = TRUE) +
  scale_x_continuous(limits = c(0, 11), 
                     breaks = seq(1, 10, 1)) +
  theme_ridges() +
  theme(legend.position = "none") +
  scale_fill_manual(values = c("#4F2E84", "#FB4F14",  "#7C1415", "#A71930", "#00143F",
                               "#0C264C", "#192E6C", "#136677", "#203731"), name = NULL) +
  geom_text(aes(label = n), 
            color = "white", 
            nudge_y = 0.2)

Result:

[image]
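One side effect of joining the counts back to the original data is that `geom_text` draws each label once per original row, so identical labels are stacked on top of each other. A variant sketch, not part of the original answer, keeps the raw data for the ridges and passes only the summarised counts to the text layer. The smaller sample, the seed, and the names `bin_counts` and `p` are assumptions made for illustration:

```r
library(dplyr)
library(ggplot2)
library(ggridges)

set.seed(42)  # assumption: seed added only to make the sketch reproducible
results <- data.frame(
  team = rep(c("Jets", "Giants", "Washington"), 300),
  rank = sample(1:10, 900, replace = TRUE)
)

# One row per team/rank bin, so each label is drawn exactly once
bin_counts <- results %>% count(team, rank)

p <- ggplot(results, aes(x = rank, y = team, group = team)) +
  geom_density_ridges2(aes(fill = team), stat = "binline",
                       binwidth = 1, scale = 0.9, draw_baseline = TRUE) +
  # The text layer uses its own data; x, y and group are inherited
  geom_text(data = bin_counts, aes(label = n), nudge_y = 0.2, size = 3) +
  scale_x_continuous(limits = c(0, 11), breaks = 1:10) +
  theme_ridges() +
  theme(legend.position = "none")

p
```

The labels sit at a fixed offset above each baseline, as in the answer, rather than on top of each bar.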

Answer 1: (score: 1)

Neilfws's answer is great, but I have always found geom_ridgeline hard to work with in situations like this, so I usually recreate the ridges with geom_rect:

library(dplyr)
library(ggplot2)
library(ggridges) # for theme_ridges

results %>%
  count(team, rank) %>%
  filter(rank<=10) %>%
  mutate(team=factor(team)) %>%
  ggplot() +
  geom_rect(aes(xmin=rank-0.5, xmax=rank+0.5, ymin=team, fill=team,
                ymax=as.numeric(team)+n*0.75/max(n))) +
  geom_text(aes(x=rank, y=as.numeric(team)-0.1, label=n)) +
  theme_ridges() +
  theme(legend.position = "none") +
  scale_fill_manual(values = c("#4F2E84", "#FB4F14",  "#7C1415", "#A71930", 
                               "#00143F", "#0C264C", "#192E6C", "#136677", 
                               "#203731"), name = NULL) +
  ylab("team")

[image: As requested]

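I particularly like the fine level of control I get from geom_rect compared with ridgelines. You do, however, lose the nice outline drawn around each ridgeline, so if that matters to you, go with the other answer.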
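To make the rectangle geometry in this answer more concrete, here is a minimal sketch that computes the same coordinates as explicit columns instead of inside `aes()`. The column names `base`, `xmin`, `xmax`, and `ymax`, the smaller sample, and the seed are assumptions added for illustration:

```r
library(dplyr)

set.seed(1)  # assumption: seed added only to make the sketch reproducible
results <- data.frame(
  team = rep(c("Jets", "Giants", "Washington"), 300),
  rank = sample(1:20, 900, replace = TRUE)
)

rects <- results %>%
  count(team, rank) %>%
  filter(rank <= 10) %>%
  mutate(team = factor(team),
         base = as.numeric(team),          # baseline: the team's position on the y axis
         xmin = rank - 0.5,                # each bar is one rank wide
         xmax = rank + 0.5,
         ymax = base + n * 0.75 / max(n))  # tallest bar rises 0.75 above its baseline

# Height of the tallest bar relative to its baseline
max(rects$ymax - rects$base)
```

Because `max(n)` is taken over all teams, bar heights are comparable across rows; changing the 0.75 factor controls how close the tallest bar comes to the next team's baseline.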