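I have some data with a count column (integers), a date column, and an identifier column (10 distinct values). I want to know when each identifier reaches a given cumulative count (for example 100). So I want to compute a running total of the counts for each identifier (I do not know how to do this first part in R with data.table), then create a condition (1 when my cumulative column is > 100, otherwise 0), and finally make a selection.
For the cumulative part, I do not know how to compute it by group based on a column's values.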
# Example of data
data <- data.frame(identifiant = c("A","A","A","A","A","B","B","B"),
                   date = as.Date(c("01/01/2018","02/01/2018","03/01/2018","04/01/2018","08/01/2018","03/01/2018","04/01/2018","08/01/2018"), format = '%d/%m/%Y'),
                   count = c(25,39,50,41,10,3,95,2))
# I would like a cumulative column like this
identifiant  date        count  Cummulate
A            01/01/2018  25     25
A            02/01/2018  39     64
A            03/01/2018  50     114
A            04/01/2018  41     155
A            08/01/2018  10     165
B            03/01/2018  3      3
B            04/01/2018  95     98
B            08/01/2018  2      100
Thanks in advance.
Answer 0 (score: 3)
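We can group by 'identifiant' and take the cumulative sum of 'count'. Note that cumsum follows row order, so the data should already be sorted by date within each identifier, as it is in the example.
library(dplyr)
data %>%
  group_by(identifiant) %>%
  mutate(Cummulate = cumsum(count))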
# A tibble: 8 x 4
# Groups:   identifiant [2]
#  identifiant date       count Cummulate
#  <fct>       <date>     <dbl>     <dbl>
#1 A           2018-01-01    25        25
#2 A           2018-01-02    39        64
#3 A           2018-01-03    50       114
#4 A           2018-01-04    41       155
#5 A           2018-01-08    10       165
#6 B           2018-01-03     3         3
#7 B           2018-01-04    95        98
#8 B           2018-01-08     2       100
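Since the question mentions data.table and also asks for the flag and the selection, here is a minimal sketch of the whole workflow with data.table. It is one possible approach, not the only one. The names `dt`, `threshold`, `reached`, and `first_reached` are illustrative and do not come from the question, and the strict `>` comparison follows the question's wording.

```r
library(data.table)

# Same example data as in the question
dt <- data.table(
  identifiant = c("A", "A", "A", "A", "A", "B", "B", "B"),
  date = as.Date(c("01/01/2018", "02/01/2018", "03/01/2018", "04/01/2018",
                   "08/01/2018", "03/01/2018", "04/01/2018", "08/01/2018"),
                 format = "%d/%m/%Y"),
  count = c(25, 39, 50, 41, 10, 3, 95, 2)
)

threshold <- 100  # assumed value, taken from the example in the question

# Sort so the running total follows date order within each identifier
setorder(dt, identifiant, date)

# Cumulative sum of 'count' within each 'identifiant'
dt[, Cummulate := cumsum(count), by = identifiant]

# 1 when the running total is strictly above the threshold, 0 otherwise
dt[, reached := as.integer(Cummulate > threshold)]

# Selection: first row per identifier where the threshold is exceeded
first_reached <- dt[reached == 1L, .SD[1], by = identifiant]
first_reached
```

With the example data, only identifier A appears in `first_reached` (on 2018-01-03), because B ends at exactly 100 and the comparison is strict. If "reaching" the value should include equality, replace `>` with `>=`.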