Find the starting value in a column, then increment by 1 until the last year

Asked: 2019-01-29 14:22:46

Tags: r min

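I am trying to find the first year in which a dummy variable equals 1 and then increment its value by 1 for every year after that.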

Here is some sample data:

id = c(1,1,1,1,1,2,2,2,2,2,3,3,3,3,3,4,4,4,4,4) 
date = c(2010,2011,2012,2013,2014,2010,2011,2012,2013,2014,2010,2011,2012,2013,2014,2010,2011,2012,2013,2014) 
income = c(100,20,45,50,70,45,66,21,45,234,124,5325,645,23234,2352,456,24234,34656,5633,13524) 
participation = c(0,0,0,1,0,1,1,1,0,0,1,0,1,0,1,0,0,0,1,1) 
df <- data.frame(id,date,income,participation)

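To describe the data: I created a longitudinal dataset with each person's income and their participation in an activity from 2010 to 2014. I am trying to look at the effect of participation on their income over time. What I have in mind is the following: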

id = c(1,1,1,1,1,2,2,2,2,2,3,3,3,3,3,4,4,4,4,4) 
date = c(2010,2011,2012,2013,2014,2010,2011,2012,2013,2014,2010,2011,2012,2013,2014,2010,2011,2012,2013,2014) 
income = c(100,20,45,50,70,45,66,21,45,234,124,5325,645,23234,2352,456,24234,34656,5633,13524) 
participation = c(0,0,0,1,2,1,2,3,4,5,1,2,3,4,5,0,0,0,1,2) 
df <- data.frame(id,date,income,participation)

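Honestly, I am lost, because the participation value is already a dummy variable. Is there a way to group by id and use participation together with date to produce the increment? Any ideas would help. Thanks!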

1 Answer:

Answer 0: (score: 4)

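After grouping by 'id', take the cummax of 'participation', which stays at 1 from the first year of participation onward, and then take the cumulative sum of that result.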

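library(dplyr)
df %>% 
    group_by(id) %>%   # restart the count for each person
    # cummax: 1 from the first participation onward; cumsum: running count of those 1s
    mutate(participation = cumsum(cummax(participation)))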
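
To see why this works, the two steps can be traced on single ids and then applied to the whole table. The following is only a sketch in base R and is not part of the original answer: it swaps dplyr's group_by for ave(), and the names p1, p3, result and expected are introduced here for illustration.

```r
# Sample data from the question
id <- rep(1:4, each = 5)
date <- rep(2010:2014, times = 4)
participation <- c(0,0,0,1,0, 1,1,1,0,0, 1,0,1,0,1, 0,0,0,1,1)
df <- data.frame(id, date, participation)

# Trace the two steps on id 1 and id 3 (illustrative names)
p1 <- df$participation[df$id == 1]   # 0 0 0 1 0
p3 <- df$participation[df$id == 3]   # 1 0 1 0 1

# Step 1: cummax is 0 before the first participation and 1 from then on
cummax(p1)                           # 0 0 0 1 1
cummax(p3)                           # 1 1 1 1 1

# Step 2: cumsum of that indicator counts years since the first participation
cumsum(cummax(p1))                   # 0 0 0 1 2
cumsum(cummax(p3))                   # 1 2 3 4 5

# Base R alternative to group_by + mutate: ave() applies FUN within each id
result <- ave(df$participation, df$id,
              FUN = function(x) cumsum(cummax(x)))

# Desired output as written in the question
expected <- c(0,0,0,1,2, 1,2,3,4,5, 1,2,3,4,5, 0,0,0,1,2)
isTRUE(all.equal(result, expected))  # TRUE
```

Both versions assume the rows are already sorted by date within each id, as they are in the sample data. If they are not, sort first, for example with df[order(df$id, df$date), ].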