I have a data frame like this:
sub = c("X001","X001", "X001","X002","X002","X001","X002","X001","X002","X002","X002","X002")
revenue = c(20, 15, -10,-25,20,-20, 17,9,14,12, -9, 11)
df = data.frame(sub, revenue)
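I want to aggregate it by sub so that the second column shows the sum of all revenue for each sub, the third column shows the sum of the absolute values, the fourth column shows the sum of all positive values, and the fifth column shows the sum of all negative values.
The result should look like this: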
Sub    All Sum   Absolute Sum   Positive Sum   Negative Sum
X001   14        74             44             -30
X002   40        108            74             -34
The code I wrote computes the All Sum column:
y <- aggregate(df$revenue, by = list(Feature = df$sub), FUN = sum)
I would be very grateful if someone with more R experience could help me compute the other three columns.
Answer 0: (score: 3)
Here is how to do it with dplyr:
library(dplyr)
df %>%
  group_by(sub) %>%
  summarise(All_Sum = sum(revenue),
            Absolute_Sum = sum(abs(revenue)),
            Positive_Sum = sum(revenue[revenue > 0]),
            Negative_Sum = sum(revenue[revenue < 0]))
sub All_Sum Absolute_Sum Positive_Sum Negative_Sum
<fctr> <dbl> <dbl> <dbl> <dbl>
1 X001 14 74 44 -30
2 X002 40 108 74 -34
Answer 1: (score: 1)
In base R, using aggregate:
aggregate(.~sub, df, function(a) c(sum(a), sum(abs(a)), sum(a[a>0]), sum(a[a<0])))
# sub revenue.1 revenue.2 revenue.3 revenue.4
#1 X001 14 74 44 -30
#2 X002 40 108 74 -34
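The aggregate call above returns its four statistics as a single matrix column, which is why the output headers read revenue.1 to revenue.4. As a minimal sketch (not part of the original answer), returning a named vector and then binding the matrix back onto the grouping column should give the column names the question asks for. The names All_Sum, Absolute_Sum, Positive_Sum and Negative_Sum are borrowed from the other answers, and res is just an illustrative variable name.

```r
# Rebuild the example data from the question
sub <- c("X001", "X001", "X001", "X002", "X002", "X001",
         "X002", "X001", "X002", "X002", "X002", "X002")
revenue <- c(20, 15, -10, -25, 20, -20, 17, 9, 14, 12, -9, 11)
df <- data.frame(sub, revenue)

# Return a *named* vector so each statistic carries a label
agg <- aggregate(revenue ~ sub, df, function(a)
  c(All_Sum      = sum(a),
    Absolute_Sum = sum(abs(a)),
    Positive_Sum = sum(a[a > 0]),
    Negative_Sum = sum(a[a < 0])))

# agg$revenue is a matrix column; bind it to the grouping column
# to get an ordinary data frame with one column per statistic
res <- cbind(agg["sub"], agg$revenue)
print(res)
```

The result is a plain data frame with the columns sub, All_Sum, Absolute_Sum, Positive_Sum and Negative_Sum, so it can be indexed like any other, for example res$Positive_Sum.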
Answer 2: (score: 0)
We can also use data.table:
library(data.table)
setDT(df)[, .(All_Sum = sum(revenue), Absolute_Sum = sum(abs(revenue)),
Positive_Sum = sum(revenue[revenue>0]), Negative_Sum = sum(revenue[revenue<0])), by = sub]
# sub All_Sum Absolute_Sum Positive_Sum Negative_Sum
#1: X001 14 74 44 -30
#2: X002 40 108 74 -34