Cumulative sum per individual in a data frame with R

Asked: 2013-03-14 02:29:13

Tags: r

I have a data frame like this:

data <- data.frame(
  ID    = c("0001", "0002", "0003", "0004", "0004", "0004", "0001", "0001", "0002", "0003"),
  Saldo = c(10, 10, 10, 15, 20, 50, 100, 80, 10, 10),
  place = c("grocery", "market", "market", "cars", "market", "market", "cars", "grocery", "cars", "cars")
)

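I am trying to compute the total Saldo for each individual in the ID variable using cumsum or apply, but I am not getting the result I want. I would like something like this: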

  ID      Saldo.Total
1 0001         190
2 0002          20
3 0003          20
4 0004          85 

2 answers:

Answer 0 (score: 5)

You can use aggregate:

> aggregate(Saldo ~ ID, data, function(x) max(cumsum(x))) ## same as sum
    ID Saldo
1 0001   190
2 0002    20
3 0003    20
4 0004    85
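
As a minimal sketch of the same idea, the call below passes `sum` directly and renames the result column to match the requested `Saldo.Total`. The renaming step and the small vector `x` are additions for illustration, not part of the original answer. Note that `max(cumsum(x))` only matches `sum(x)` when the running total never drops, which holds here because every Saldo is non-negative.

```r
# Rebuild the relevant columns of the question's data frame.
data <- data.frame(
  ID    = c("0001", "0002", "0003", "0004", "0004", "0004", "0001", "0001", "0002", "0003"),
  Saldo = c(10, 10, 10, 15, 20, 50, 100, 80, 10, 10)
)

# One row per ID, with the plain sum of Saldo.
totals <- aggregate(Saldo ~ ID, data, sum)
names(totals)[2] <- "Saldo.Total"  # rename to the column name asked for
print(totals)

# Illustrative vector with a negative entry: the two expressions differ.
x <- c(10, -5)
max(cumsum(x))  # largest running total
sum(x)          # overall total
```

If negative balances are possible, `sum` is the safer choice for a per-ID total.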

If you really are interested in the cumulative sum per ID, try the following:

within(data, {
  Saldo.Total <- ave(Saldo, ID, FUN = cumsum)
})
#     ID Saldo   place Saldo.Total
# 1  0001    10 grocery          10
# 2  0002    10  market          10
# 3  0003    10  market          10
# 4  0004    15    cars          15
# 5  0004    20  market          35
# 6  0004    50  market          85
# 7  0001   100    cars         110
# 8  0001    80 grocery         190
# 9  0002    10    cars          20
# 10 0003    10    cars          20
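
The running totals above also contain the requested per-ID totals: the last row of each ID holds that ID's final sum. The following sketch, which assumes plain base R and introduces the hypothetical helper variable `last_rows`, stores the running total in the data frame and then keeps only the final row per ID.

```r
data <- data.frame(
  ID    = c("0001", "0002", "0003", "0004", "0004", "0004", "0001", "0001", "0002", "0003"),
  Saldo = c(10, 10, 10, 15, 20, 50, 100, 80, 10, 10)
)

# Running total of Saldo within each ID, in the original row order.
data$Saldo.Total <- ave(data$Saldo, data$ID, FUN = cumsum)

# Keep the last occurrence of each ID; its running total is the ID's sum.
last_rows <- data[!duplicated(data$ID, fromLast = TRUE), c("ID", "Saldo.Total")]
last_rows <- last_rows[order(last_rows$ID), ]
print(last_rows)
```

Remember that `within()` returns a modified copy, so its result has to be assigned (for example `data <- within(data, ...)`) if the new column should persist.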

Answer 1 (score: 1)

I think you may have gotten confused, because what you want is not a cumulative sum; it is just a sum:

library(plyr)
ddply(
  data,
  .(ID),
  summarize,
  Saldo.Total=sum(Saldo)
  )

Output:

    ID Saldo.Total
1 0001         190
2 0002          20
3 0003          20
4 0004          85

A cumulative sum is the "running total" as you move along a vector, for example:

> x = c(1, 2, 3, 4, 5)
> cumsum(x)
[1]  1  3  6 10 15
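
To make the link between the two explicit, here is a small sketch (the variable names `running` and `per_id` are illustrative): the last element of `cumsum(x)` is always `sum(x)`, and a per-group sum can also be obtained in base R with `tapply`, without loading plyr.

```r
x <- c(1, 2, 3, 4, 5)
running <- cumsum(x)

# The final running total equals the plain sum.
tail(running, 1) == sum(x)

# Per-ID sums with base R, using the IDs and Saldo values from the question.
ID     <- c("0001", "0002", "0003", "0004", "0004", "0004", "0001", "0001", "0002", "0003")
Saldo  <- c(10, 10, 10, 15, 20, 50, 100, 80, 10, 10)
per_id <- tapply(Saldo, ID, sum)
print(per_id)
```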