Stacked bar chart without a fill variable?

Asked: 2019-06-01 21:22:17

Tags: r ggplot2 dplyr geom-bar geom-col

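I have built a dataset and am trying to display it as a stacked bar chart.

My x-axis will show three bars: "left", "middle" and "right".

My y-axis will be the Total_Completed_EPA associated with each bar.

The only catch is that Total_Completed_EPA is a mutated variable, created as the sum of two other columns in my dataset. I just want each stacked bar to show how much of Total_Completed_EPA comes from each of those two columns.

The data is: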


pass_location Air_Epa_Play YAC_EPA_Play Total_Completed_EPA
middle         0.263         0.434           0.697
left           0.086         0.439           0.525
right          0.082         0.442           0.524

Code that builds the data:

library(dplyr)

pass_epa <- pbp_2018 %>%
  filter(play_type %in% c("pass", "no_play", "qb_spike"),
         !is.na(epa)) %>%                 # keep only plays with a non-missing EPA
  group_by(pass_location) %>%
  summarize(pass_epa = sum(epa),
            air_epa = sum(comp_air_epa),
            yac_epa = sum(comp_yac_epa),
            pass_plays = n()) %>%
  ungroup() %>%
  mutate(EPA_Play = round(pass_epa / pass_plays, 3),
         Air_Epa_Play = round(air_epa / pass_plays, 3),
         YAC_EPA_Play = round(yac_epa / pass_plays, 3),
         Total_Completed_EPA = Air_Epa_Play + YAC_EPA_Play) %>%
  slice(-1) %>%                           # drop the first summary row
  arrange(-EPA_Play) %>%
  filter(pass_plays >= 80) %>%
  select(pass_location, Air_Epa_Play, YAC_EPA_Play, Total_Completed_EPA)

So my dataset has only 4 columns. Air_Epa_Play and YAC_EPA_Play add up to Total_Completed_EPA.

Visualization:

ggplot(pass_epa, aes(x = pass_location, y = Total_Completed_EPA, fill = ?)) +
  geom_col(position = "dodge") 

I just can't get Air_Epa_Play and YAC_EPA_Play to stack inside the Total_Completed_EPA bar.


1 answer:

Answer 0 (score: 1)

Data

test <- data.frame(
  pass_location   = c('middle', 'left', 'right'),
  Air_Epa_Play    = c(0.263, 0.086, 0.082),
  YAC_Epa_Play    = c(0.434, 0.439, 0.442),
  Total_Completed = c(0.697, 0.525, 0.524)
)

  pass_location Air_Epa_Play YAC_Epa_Play Total_Completed
1        middle        0.263        0.434           0.697
2          left        0.086        0.439           0.525
3         right        0.082        0.442           0.524

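You can drop the Total_Completed column with select(-Total_Completed). ggplot stacks (sums) the values for you, so you don't have to compute the total yourself. However, ggplot also prefers data in long format rather than wide format, so you need to gather() the relevant values (the ones that go on the y-axis) into a single column. Note that I use gather(..., -pass_location) to leave the grouping column out of the reshape. Try the following both with and without fill=var. Once you realize that ggplot likes long-format data, working with it becomes much more intuitive, at least it did for me.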

library(tidyverse)
test %>% 
  select(-Total_Completed) %>% 
  gather(var, value, -pass_location) %>% 
  ggplot(., aes(x=pass_location, y=value, fill=var)) + 
  geom_col()
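
For readers on a newer tidyr, the same reshape can be sketched with pivot_longer(), which was added in tidyr 1.0.0 as the successor to gather(). This is only an illustrative variant of the answer above, not part of it: the names test_long and bar_heights are made up here, and the data frame is the answer's example data with the total column already left out.

```r
library(tidyr)    # pivot_longer() needs tidyr >= 1.0.0
library(ggplot2)

# Example data from the answer, without the precomputed total column.
test <- data.frame(
  pass_location = c("middle", "left", "right"),
  Air_Epa_Play  = c(0.263, 0.086, 0.082),
  YAC_Epa_Play  = c(0.434, 0.439, 0.442)
)

# Reshape wide -> long: one row per (pass_location, component) pair.
# test_long is a hypothetical name used only for this sketch.
test_long <- pivot_longer(
  test,
  cols      = -pass_location,   # reshape everything except the grouping column
  names_to  = "var",
  values_to = "value"
)

# Sanity check: the per-location sums are the heights the stacked bars will have.
bar_heights <- tapply(test_long$value, test_long$pass_location, sum)

# geom_col() stacks by default, so the two components pile up to the total.
p <- ggplot(test_long, aes(x = pass_location, y = value, fill = var)) +
  geom_col()
print(p)
```

The question's attempt used position = "dodge", which places the bars side by side instead of stacking them. Leaving position at its default, as here, is what produces the stacked layout.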