Different types of aggregation in R

Time: 2017-04-23 22:33:30

Tags: r dataframe sum aggregate

I have a data frame that looks like this:

sub = c("X001","X001", "X001","X002","X002","X001","X002","X001","X002","X002","X002","X002") 
revenue = c(20, 15, -10,-25,20,-20, 17,9,14,12, -9, 11)

df = data.frame(sub, revenue)

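I want to aggregate it by sub so that the second column shows the sum of all revenue for that sub, the third column shows the sum of the absolute values, the fourth column shows the sum of all positive values, and the fifth column shows the sum of all negative values.

The result should look like this: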

Sub      All Sum    Absolute Sum    Positive Sum    Negative Sum
X001     14         74              44              -30
X002     40         108             74              -34

The code I wrote computes the All Sum:

y <- aggregate(df$revenue, by = list(Feature = df$sub), FUN = sum)

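I would be very grateful if someone with more R experience could help me compute the other three columns.

3 Answers:

Answer 0: (score: 3)

Here is how to do this with dplyr: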

library(dplyr)
df %>%
  group_by(sub) %>%
  summarise(All_Sum = sum(revenue), Absolute_Sum = sum(abs(revenue)),
            Positive_Sum = sum(revenue[revenue > 0]), Negative_Sum = sum(revenue[revenue < 0]))

     sub All_Sum Absolute_Sum Positive_Sum Negative_Sum
  <fctr>   <dbl>        <dbl>        <dbl>        <dbl>
1   X001      14           74           44          -30
2   X002      40          108           74          -34

Answer 1: (score: 1)

In base R, using aggregate:

aggregate(.~sub, df, function(a) c(sum(a), sum(abs(a)), sum(a[a>0]), sum(a[a<0])))

#  sub revenue.1 revenue.2 revenue.3 revenue.4
#1 X001        14        74        44       -30
#2 X002        40       108        74       -34
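One thing to be aware of with the aggregate call above: because the function returns a vector of four values, the result holds them in a single matrix column named revenue, and the printed names revenue.1 to revenue.4 are only display labels. The sketch below is one possible way to get descriptively named, ordinary columns. It assumes the same df as in the question; the names res and flat and the column labels are my own choices for illustration, not part of the original answer.

```r
# Rebuild the example data from the question
sub <- c("X001", "X001", "X001", "X002", "X002", "X001",
         "X002", "X001", "X002", "X002", "X002", "X002")
revenue <- c(20, 15, -10, -25, 20, -20, 17, 9, 14, 12, -9, 11)
df <- data.frame(sub, revenue)

# Returning a named vector gives the matrix column meaningful column names
res <- aggregate(revenue ~ sub, df, function(a) {
  c(All_Sum      = sum(a),
    Absolute_Sum = sum(abs(a)),
    Positive_Sum = sum(a[a > 0]),
    Negative_Sum = sum(a[a < 0]))
})

# res$revenue is a 2 x 4 matrix; bind it next to the grouping column
# so that each statistic becomes a regular data frame column
flat <- cbind(res["sub"], res$revenue)
print(flat)
```

After this step, flat$Positive_Sum and the other columns can be accessed directly, which is not possible with the matrix column in the raw aggregate result.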

Answer 2: (score: 0)

We can also use data.table:

library(data.table)
setDT(df)[, .(All_Sum = sum(revenue), Absolute_Sum =  sum(abs(revenue)),
   Positive_Sum = sum(revenue[revenue>0]), Negative_Sum = sum(revenue[revenue<0])), by = sub]  
#    sub All_Sum Absolute_Sum Positive_Sum Negative_Sum
#1: X001      14           74           44          -30
#2: X002      40          108           74          -34