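I have built a dataset and am trying to get a stacked bar chart to display.
My x-axis will show three bars: "left", "middle", and "right".
My y-axis will be the "total_completed_epa" associated with each bar.
The only problem is that "total_completed_epa" is a mutated variable, created as the sum of two other columns in my dataset. I just want the stacked bar to show how much each of those two columns contributes to "total_completed_epa".
The data is: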
pass_location Air_Epa_Play YAC_EPA_Play Total_Completed_EPA
middle 0.263 0.434 0.697
left 0.086 0.439 0.525
right 0.082 0.442 0.524
Getting the data:
library(dplyr)

pass_epa <- pbp_2018 %>%
  # keep passing plays and drop rows with a missing epa
  filter(play_type %in% c("pass", "no_play", "qb_spike"),
         !is.na(epa)) %>%
  group_by(pass_location) %>%
  summarize(pass_epa = sum(epa),
            air_epa = sum(comp_air_epa),
            yac_epa = sum(comp_yac_epa),
            pass_plays = n()) %>%
  ungroup() %>%
  # per-play averages, rounded to 3 decimals
  mutate(EPA_Play = round(pass_epa / pass_plays, 3),
         Air_Epa_Play = round(air_epa / pass_plays, 3),
         YAC_EPA_Play = round(yac_epa / pass_plays, 3),
         Total_Completed_EPA = Air_Epa_Play + YAC_EPA_Play) %>%
  slice(-1) %>%  # drop the first summary row
  arrange(-EPA_Play) %>%
  filter(pass_plays >= 80) %>%
  select(pass_location, Air_Epa_Play, YAC_EPA_Play, Total_Completed_EPA)
So my dataset has only 4 columns. Air_Epa_Play and YAC_EPA_Play add up to "Total_Completed_EPA".
Visualization:
ggplot(pass_epa, aes(x = pass_location, y = Total_Completed_EPA, fill = ?)) +
geom_col(position = "dodge")
I just can't get Air_Epa_Play and YAC_EPA_Play to stack within the Total_Completed_EPA bar.
Answer 0 (score: 1)
Data
test <- data.frame(pass_location=c('middle','left','right'), Air_Epa_Play=c(0.263,0.086,0.082), YAC_Epa_Play=c(0.434,0.439,0.442), Total_Completed=c(0.697,0.525,0.524))
pass_location Air_Epa_Play YAC_Epa_Play Total_Completed
1 middle 0.263 0.434 0.697
2 left 0.086 0.439 0.525
3 right 0.082 0.442 0.524
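You can drop the Total_Completed column with select(-Total_Completed). ggplot does the stacking/summing for you, so you don't have to compute the total yourself. However, ggplot also prefers data in long format (rather than wide), so you need to gather() the relevant values (the ones that go on the y-axis) into a single column. Note that I use gather(..., -pass_location) to leave the grouping column out of the reshape. Try the following with and without fill=var. Once you see that ggplot likes long-format data, it becomes much more intuitive to use, at least it did for me.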
library(tidyverse)
test %>%
select(-Total_Completed) %>%
gather(var, value, -pass_location) %>%
ggplot(., aes(x=pass_location, y=value, fill=var)) +
geom_col()
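For readers on a newer tidyr, the same reshape can be sketched with pivot_longer(), which tidyr 1.0.0 and later offer as the successor to gather(). This is an alternative to the answer's code, not the answerer's original code: the names long, stack_heights and p are made up for illustration, and the check only confirms that the stacked heights reproduce the Total_Completed column from the sample data.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Same three rows as the sample data above.
test <- data.frame(
  pass_location   = c("middle", "left", "right"),
  Air_Epa_Play    = c(0.263, 0.086, 0.082),
  YAC_Epa_Play    = c(0.434, 0.439, 0.442),
  Total_Completed = c(0.697, 0.525, 0.524)
)

# Wide -> long: one row per (pass_location, component) pair.
# `long`, `var` and `value` are illustrative names.
long <- test %>%
  select(-Total_Completed) %>%
  pivot_longer(cols = -pass_location, names_to = "var", values_to = "value")

# Sanity check: summing the components per bar should give back the totals,
# which is the height ggplot stacks to on its own.
stack_heights <- long %>%
  group_by(pass_location) %>%
  summarise(height = sum(value))

# geom_col() stacks by default, so no position argument is needed.
p <- ggplot(long, aes(x = pass_location, y = value, fill = var)) +
  geom_col()
```

Printing p draws the stacked chart. If the goal is each component's share of the bar rather than its absolute value, geom_col(position = "fill") rescales every bar to a height of 1 so the segments read as proportions.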