Calculate the cumulative sum of one column based on the rank in another column

Time: 2014-04-02 22:54:51

Tags: r

I have a dataset that looks like this:

amount    rank    category
4000      1       A
200       3       A
1000      2       A
10        4       A
500       1       B
...

I want to compute the cumulative sum of amount, taken in order of rank within each category, i.e. return:

cum      rank    category
4000     1       A
5000     2       A
5200     3       A
5210     4       A
...

Any help would be great! :)

2 Answers:

Answer 0: (score: 1)

A data.table solution:

require(data.table) ## version >= 1.9.0
setDT(dat)          ## converts data.frame to data.table by reference

setkey(dat, category, rank) ## sort first by category, then by rank
dat[, csum := cumsum(amount), by=category]

#    amount rank category csum
# 1:   4000    1        A 4000
# 2:   1000    2        A 5000
# 3:    200    3        A 5200
# 4:     10    4        A 5210
# 5:    500    1        B  500

Answer 1: (score: 0)

A dplyr solution:

library(dplyr)

data = data.frame(amount = c(4000, 200, 1000, 10, 500),
                  rank = c(1, 3, 2, 4, 1),
                  category = c("A", "A", "A", "A","B"))

# Sort by category and rank, then take a running total of amount per category
data %>% arrange(category, rank) %>%
  group_by(category) %>% mutate(csum = cumsum(amount))
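
For comparison, the same sort-then-accumulate idea can probably be done in base R with no extra packages. This is only a minimal sketch, not something taken from either answer: it assumes the data frame is called dat (as in the first answer) and holds just the five rows shown in the question.

```r
# Sample data copied from the rows shown in the question
# (the name `dat` is an assumption borrowed from the data.table answer).
dat <- data.frame(
  amount   = c(4000, 200, 1000, 10, 500),
  rank     = c(1, 3, 2, 4, 1),
  category = c("A", "A", "A", "A", "B")
)

# Sort by category, then by rank, so cumsum() sees amounts in rank order.
dat <- dat[order(dat$category, dat$rank), ]

# ave() applies cumsum() separately within each category.
dat$csum <- ave(dat$amount, dat$category, FUN = cumsum)

# csum is now 4000, 5000, 5200, 5210 for category A and 500 for category B.
print(dat)
```

The sort has to happen before ave() is called, because cumsum() simply accumulates in row order. If the original row order matters, it would need to be saved first and restored afterwards.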