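I am trying to sum the values in one column, grouped by ID. A single ID can have several accounts, each opened on a different date. For every account, I want the total amount of the accounts under the same ID that were opened before it, i.e. those whose opening date is earlier than that account's opening date. In the expected result, sum_amount is the sum of the amounts of any accounts opened before the account in question. Here is the sample code: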
id = c(1,1,1,2,2,2,2,3)
ac = c('a','z', 'k','d', 'g', 'f', 'w', 'h')
date_opened = c('2014-05-04','2014-03-01', '2014-06-01', '2014-04-01', '2014-06-01',
'2014-03-01', '2014-01-01', '2014-01-01')
amount = c(200, 300,100, 400, 200, 50, 100, 200)
data <- data.frame(id, ac, date_opened, amount)
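Answer 0: (score: 0)
The solution is to sort by date and take a cumulative sum (cumsum) within each id group. Subtracting amount from the running total removes the current account's own amount, leaving only the accounts opened earlier. Here is a data.table solution: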
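library(data.table)
setDT(data)                                            # convert to data.table by reference
data[, date_opened := as.Date(date_opened)]            # compare real dates, not strings
setkey(data, date_opened)                              # sort rows by opening date
data[, amountsum := cumsum(amount) - amount, by = id]  # total of earlier accounts per id
data[, .SD, by = id]                                   # print the rows grouped by id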
   id ac date_opened amount amountsum
1:  2  w  2014-01-01    100         0
2:  2  f  2014-03-01     50       100
3:  2  d  2014-04-01    400       150
4:  2  g  2014-06-01    200       550
5:  3  h  2014-01-01    200         0
6:  1  z  2014-03-01    300         0
7:  1  a  2014-05-04    200       300
8:  1  k  2014-06-01    100       500
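To see why subtracting amount from the running total works, here is a small base R sketch using only the four amounts for id 2 from the output above, already in date order. The variable names running_total and before_opening are illustrative and not part of the original answer.

```r
# Amounts for id 2, sorted by opening date (accounts w, f, d, g)
amount <- c(100, 50, 400, 200)

# Running total that still includes the current account
running_total <- cumsum(amount)

# Remove each account's own amount so only earlier accounts remain
before_opening <- running_total - amount
before_opening
```

The result lines up with the amountsum values shown for id 2: the first account has nothing before it, and each later account sees the total of the ones opened earlier.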
And a dplyr solution:
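library(dplyr)
data %>%
  mutate(date_opened = as.Date(date_opened)) %>%   # compare real dates, not strings
  arrange(id, date_opened) %>%                     # oldest account first within each id
  group_by(id) %>%
  mutate(amountsum = cumsum(amount) - amount) %>%  # total of earlier accounts only
  ungroup()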
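Both versions above assume that no two accounts under the same id share an opening date. With tied dates, cumsum would count one of the tied accounts as "earlier" than the other. If ties are possible, one option is to sum the strictly earlier dates explicitly. The following is a minimal base R sketch under that assumption, not part of the original answer; sum_before is a hypothetical helper name, and the approach is quadratic within each group, so it suits small groups.

```r
# Sample data from the question, with dates parsed as Date
data <- data.frame(
  id = c(1, 1, 1, 2, 2, 2, 2, 3),
  ac = c("a", "z", "k", "d", "g", "f", "w", "h"),
  date_opened = as.Date(c("2014-05-04", "2014-03-01", "2014-06-01", "2014-04-01",
                          "2014-06-01", "2014-03-01", "2014-01-01", "2014-01-01")),
  amount = c(200, 300, 100, 400, 200, 50, 100, 200)
)

# Hypothetical helper: for each element, sum the amounts whose date is
# strictly earlier than that element's date.
sum_before <- function(dates, amounts) {
  vapply(seq_along(dates),
         function(i) sum(amounts[dates < dates[i]]),
         numeric(1))
}

# Apply the helper separately within each id; row order is left unchanged
data$amountsum <- NA_real_
for (g in unique(data$id)) {
  rows <- data$id == g
  data$amountsum[rows] <- sum_before(data$date_opened[rows], data$amount[rows])
}
data
```

On the sample data, which has no ties within an id, this gives the same amountsum per account as the cumsum approach, while keeping the rows in their original order.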