How to summarize a table by 2 columns in R

Time: 2015-07-12 18:59:08

Tags: r aggregate split-apply-combine

I want to summarize this dataset by grouping first by Period and then by Payer ID, so that the result shows a subtotal per month for any given user, as follows:

data.frame:

Payer   Period
1   10  1-1015
2   15  2-1015
3   14  3-1015
1   1   1-1015
3   5   1-1015
1   7   4-1015
3   8   4-1015
1   4   5-1015

The result should look like this:

Payer   Period
1   11  1-1015
3   5   1-1015
2   15  2-1015
3   14  3-1015
1   7   4-1015
3   8   4-1015
1   4   5-1015

What is the best way to do this? Thanks!

2 Answers:

Answer 0: (score: 4)

Assuming there are three columns (Payer, Amount, Period), you can use aggregate:

 aggregate(Amount~., df1, FUN=sum)
 #    Payer Period Amount
 #1     1 1-1015     11
 #2     3 1-1015      5
 #3     2 2-1015     15
 #4     3 3-1015     14
 #5     1 4-1015      7
 #6     3 4-1015      8
 #7     1 5-1015      4
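
The `.` in the formula stands for every column of `df1` other than `Amount`. A minimal sketch of the same call with the grouping columns named explicitly, which may be easier to read if the data frame later gains more columns. The `data.frame()` call only rebuilds `df1` from the Data section so the snippet runs on its own, and `res` is a name chosen here for illustration.

```r
# Rebuild df1 (same values as in the Data section of this answer)
df1 <- data.frame(
  Payer  = c(1L, 2L, 3L, 1L, 3L, 1L, 3L, 1L),
  Amount = c(10L, 15L, 14L, 1L, 5L, 7L, 8L, 4L),
  Period = c("1-1015", "2-1015", "3-1015", "1-1015",
             "1-1015", "4-1015", "4-1015", "5-1015"),
  stringsAsFactors = FALSE
)

# Sum Amount for each (Payer, Period) pair, naming both grouping columns
res <- aggregate(Amount ~ Payer + Period, data = df1, FUN = sum)
res
```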

Or

 library(data.table)#v1.9.5+
 setDT(df1)[, list(Amount=sum(Amount)), .(Period, Payer)]
 #    Period Payer Amount
 #1: 1-1015     1     11
 #2: 2-1015     2     15
 #3: 3-1015     3     14
 #4: 1-1015     3      5
 #5: 4-1015     1      7
 #6: 4-1015     3      8
 #7: 5-1015     1      4

With the rows in a different order:

 aggregate(Amount~., df2, FUN=sum)
 #  Payer Period Amount
 #1     1 1-1015     11
 #2     3 1-1015      5
 #3     2 2-1015     15
 #4     3 3-1015     14
 #5     1 4-1015      7
 #6     3 4-1015      8
 #7     1 5-1015      4

Data

 df1 <- structure(list(Payer = c(1L, 2L, 3L, 1L, 3L, 1L, 3L, 1L), 
 Amount = c(10L, 
 15L, 14L, 1L, 5L, 7L, 8L, 4L), Period = c("1-1015", "2-1015", 
 "3-1015", "1-1015", "1-1015", "4-1015", "4-1015", "5-1015")),
 .Names = c("Payer", 
  "Amount", "Period"), class = "data.frame", row.names = c(NA, -8L))

 set.seed(24)
 df2 <- df1[sample(nrow(df1)),]
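
A quick way to check that row order does not affect the subtotals, sketched in base R. The snippet rebuilds `df1` and the shuffled `df2` so it runs on its own; `res1` and `res2` are names introduced here for illustration.

```r
# Rebuild df1 and a shuffled copy df2 (same construction as above)
df1 <- data.frame(
  Payer  = c(1L, 2L, 3L, 1L, 3L, 1L, 3L, 1L),
  Amount = c(10L, 15L, 14L, 1L, 5L, 7L, 8L, 4L),
  Period = c("1-1015", "2-1015", "3-1015", "1-1015",
             "1-1015", "4-1015", "4-1015", "5-1015"),
  stringsAsFactors = FALSE
)
set.seed(24)
df2 <- df1[sample(nrow(df1)), ]

# aggregate() returns its groups in sorted order, so the two results
# can be compared directly despite the different input row order
res1 <- aggregate(Amount ~ ., df1, FUN = sum)
res2 <- aggregate(Amount ~ ., df2, FUN = sum)
all.equal(res1, res2, check.attributes = FALSE)
```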

Answer 1: (score: 1)

library(dplyr)
df %>%
    group_by(Period, Payer) %>%
    summarize(Amount = sum(Amount)) %>%  # summarize() drops the last grouping level (Payer)
    ungroup()                            # ungroup() removes the remaining grouping (Period)

# if the rows do not come out in the order you want, add an explicit %>% arrange(Period, Payer)
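
A self-contained variant of the pipeline above, as a sketch. It assumes dplyr 1.0.0 or later, where `summarise()` accepts `.groups = "drop"` to return an ungrouped result in one step. The `df1` definition repeats the data from the first answer, and `res` is an illustrative name.

```r
library(dplyr)

# Same data as df1 in the first answer
df1 <- data.frame(
  Payer  = c(1L, 2L, 3L, 1L, 3L, 1L, 3L, 1L),
  Amount = c(10L, 15L, 14L, 1L, 5L, 7L, 8L, 4L),
  Period = c("1-1015", "2-1015", "3-1015", "1-1015",
             "1-1015", "4-1015", "4-1015", "5-1015"),
  stringsAsFactors = FALSE
)

res <- df1 %>%
  group_by(Period, Payer) %>%
  # .groups = "drop" returns an ungrouped tibble, so no ungroup() is needed
  summarise(Amount = sum(Amount), .groups = "drop") %>%
  # make the row order explicit rather than relying on the grouping order
  arrange(Period, Payer)
res
```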