Cumulative sum over time

Time: 2019-09-11 02:37:27

Tags: r dplyr tidyverse

I have data on the goals scored by each player in each season:

playerID <- c(1,2,3,1,2,3,1,2,3,1,2,3)
year <- c(2002,2000,2000,2003,2001,2001,2000,2002,2002,2001,2003,2003)
goals <- c(25,21,27,31,39,34,42,44,46,59,55,53)
my_data <- data.frame(playerID, year, goals)

I want to plot each player's cumulative number of goals over time:

ggplot(my_data, aes(x=year, y=cumsum_goals, group=playerID)) + geom_line()

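I tried this with dplyr, using mutate with cumsum after group_by, but it only gives the right result when the data are already sorted by year (see player 1):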

new_data <- my_data %>%
  group_by(playerID) %>%
  mutate(cumsum_goals=cumsum(goals))

Is there a way to make this code robust to data in which the years are not in chronological order?

1 answer:

Answer 0 (score: 2)

We can arrange by playerID and year, take the cumsum within each player, and then plot:

library(dplyr)
library(ggplot2)

my_data %>%
  arrange(playerID, year) %>%
  group_by(playerID) %>%
  mutate(cumsum_goals=cumsum(goals)) %>%
  ggplot() + aes(x=year, y= cumsum_goals, color = factor(playerID)) + geom_line()
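To see why the sort matters, here is a minimal sketch that computes the cumulative sum both ways and compares the rows for player 1. It reuses the data from the question; the names `unsorted` and `sorted` are only illustrative and do not appear in the original post.

```r
library(dplyr)

# Data from the question
playerID <- c(1,2,3,1,2,3,1,2,3,1,2,3)
year <- c(2002,2000,2000,2003,2001,2001,2000,2002,2002,2001,2003,2003)
goals <- c(25,21,27,31,39,34,42,44,46,59,55,53)
my_data <- data.frame(playerID, year, goals)

# Cumulative sum in the original row order (the attempt from the question)
unsorted <- my_data %>%
  group_by(playerID) %>%
  mutate(cumsum_goals = cumsum(goals)) %>%
  ungroup()

# Cumulative sum after sorting by player and year (the approach above)
sorted <- my_data %>%
  arrange(playerID, year) %>%
  group_by(playerID) %>%
  mutate(cumsum_goals = cumsum(goals)) %>%
  ungroup()

# Compare player 1 in both versions, listed by year
unsorted %>% filter(playerID == 1) %>% arrange(year)
sorted %>% filter(playerID == 1)
```

For player 1 the unsorted version starts accumulating at the 2002 row, so the 2000 season already shows 98 goals, while the sorted version gives 42, 101, 126, 157 for 2000 through 2003. cumsum simply follows row order within each group, which is why the arrange step has to come before the mutate.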
