Cumulate rows by date value for each class

Time: 2019-06-17 16:31:25

Tags: r

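I have some data with a count column (containing integers), a date column, and an identifier column (containing 10 distinct values). I want to know when each identifier reaches a given value in the count column (for example, 100). So I want to cumulate the count values for each identifier (I don't know how to do this first part in R, for example with data.table), then apply a condition (when my cumulative column is > 100, I set 1, otherwise 0) and a selection.

For the cumulative part, I don't know how to do it depending on the values of a column.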

# Example of data
data <- data.frame(identifiant = c("A","A","A","A","A","B","B","B"),
                   date = as.Date(c("01/01/2018","02/01/2018","03/01/2018","04/01/2018","08/01/2018","03/01/2018","04/01/2018","08/01/2018"), format = '%d/%m/%Y'),
                   count = c(25,39,50,41,10,3,95,2))



# I would like a cumulative column like this

identifiant  date        count  Cummulate
A            01/01/2018     25         25
A            02/01/2018     39         64
A            03/01/2018     50        114
A            04/01/2018     41        155
A            08/01/2018     10        165
B            03/01/2018      3          3
B            04/01/2018     95         98
B            08/01/2018      2        100

Thanks in advance

1 answer:

Answer 0: (score: 3)

We can group by 'identifiant' and get the cumulative sum of 'count'

library(dplyr)
data %>% 
   group_by(identifiant) %>% 
   mutate(Cummulate = cumsum(count))
# A tibble: 8 x 4
# Groups:   identifiant [2]
#  identifiant date       count Cummulate
#  <fct>       <date>     <dbl>     <dbl>
#1 A           2018-01-01    25        25
#2 A           2018-01-02    39        64
#3 A           2018-01-03    50       114
#4 A           2018-01-04    41       155
#5 A           2018-01-08    10       165
#6 B           2018-01-03     3         3
#7 B           2018-01-04    95        98
#8 B           2018-01-08     2       100
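
The question also asks for a 0/1 flag once the cumulative value passes a threshold, and for a selection of when that happens. A minimal sketch of that second step follows, assuming dplyr is available. The names `threshold`, `reached`, `flagged` and `first_reached` are illustrative and do not come from the original post, and the rows are sorted by date first on the assumption that the running total should follow date order.

```r
library(dplyr)

# Same example data as in the question
data <- data.frame(
  identifiant = c("A", "A", "A", "A", "A", "B", "B", "B"),
  date = as.Date(c("01/01/2018", "02/01/2018", "03/01/2018", "04/01/2018",
                   "08/01/2018", "03/01/2018", "04/01/2018", "08/01/2018"),
                 format = "%d/%m/%Y"),
  count = c(25, 39, 50, 41, 10, 3, 95, 2)
)

# Assumed target value, taken from the "for example, 100" in the question
threshold <- 100

flagged <- data %>%
  arrange(identifiant, date) %>%   # cumsum depends on row order
  group_by(identifiant) %>%
  mutate(Cummulate = cumsum(count),
         # 1 once the running total is strictly above the threshold, else 0
         reached = as.integer(Cummulate > threshold)) %>%
  ungroup()

# First row per identifiant where the flag is 1;
# identifiers that never exceed the threshold are dropped
first_reached <- flagged %>%
  filter(reached == 1) %>%
  group_by(identifiant) %>%
  slice(1) %>%
  ungroup()

print(first_reached)
```

With the example data, only A is selected (on 2018-01-03, at a cumulative value of 114). B ends at exactly 100, so the strict `>` from the question excludes it; use `>=` instead if "reaching" the value should include equality.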