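Subtotals for each category across multiple columns

Time: 2019-01-10 22:13:42

Tags: r datatable subtotal

I am looking for an efficient way to add subtotals for every column, as a new row, for each category in the "id" column. The code below produces the desired output, but this approach is inefficient for large datasets. Is it possible to do this with data.table?

Thanks!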

data <- data.frame(id = c("a","b","a","b","c","c","c","a","a","b"),
               total = c(1,2,3,4,2,3,4,2,3,4),
               total2 = c(2,3,4,2,3,4,5,6,4,2),
               total3 = c(2,3,4,5,6,3,2,3,4,5))

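# Empty frame that accumulates the result
data_new <- data.frame(id = character(), total = numeric(),
                       total2 = numeric(), total3 = numeric())

for (i in unique(data$id)) {
  # Rows belonging to the current id
  subset <- data[data$id == i, ]
  # One subtotal row holding the column sums for this id
  subtotals <- data.frame(id = i, total = sum(subset$total),
                          total2 = sum(subset$total2), total3 = sum(subset$total3))
  # Append the subtotal row to the group, then the group to the result
  subset <- rbind(subset, subtotals)
  data_new <- rbind(data_new, subset)
}

data_new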

3 answers:

Answer 0 (score: 1)

Here is a tidyverse-style approach:

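library(dplyr)

# Sketch of a dplyr pipeline for this task, assuming dplyr >= 1.0 (for across())
data %>%
  group_by(id) %>%
  summarise(across(everything(), sum)) %>%  # one subtotal row per id
  bind_rows(data, .) %>%                    # append subtotals below the original rows
  arrange(id)                               # sort by id so each subtotal follows its group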

Created on 2019-01-10 by the reprex package (v0.2.1)

Answer 1 (score: 1)

Here is a base R solution using aggregate. Thanks to @thelatemail for simplifying the original version.

SubTotals = aggregate(data[,2:4], data["id"], sum)
data_new = rbind(data, SubTotals)
data_new = data_new[order(data_new$id),]
data_new
   id total total2 total3
1   a     1      2      2
3   a     3      4      4
8   a     2      6      3
9   a     3      4      4
11  a     9     16     13
2   b     2      3      3
4   b     4      2      5
10  b     4      2      5
12  b    10      7     13
5   c     2      3      6
6   c     3      4      3
7   c     4      5      2
13  c     9     12     11

Answer 2 (score: 1)

Here is a data.table solution:

library(data.table)
setDT(data)
rbind(data, data[, lapply(.SD,sum), by=id])[order(id)]
#    id total total2 total3
# 1:  a     1      2      2
# 2:  a     3      4      4
# 3:  a     2      6      3
# 4:  a     3      4      4
# 5:  a     9     16     13
# 6:  b     2      3      3
# 7:  b     4      2      5
# 8:  b     4      2      5
# 9:  b    10      7     13
#10:  c     2      3      6
#11:  c     3      4      3
#12:  c     4      5      2
#13:  c     9     12     11

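Group by id using by=, then sum every variable other than id via lapply(.SD,sum). Then rbind the result back onto the main set, and finally order by id so that each subtotal row falls into place after the rows of its group.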
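To make the individual steps easier to follow, here is a minimal sketch that splits the one-liner into two named stages. The names dt, subtotals and result are introduced here only for illustration, and the data is the example from the question rebuilt directly as a data.table.

```r
library(data.table)

# Example data from the question, built directly as a data.table
dt <- data.table(id     = c("a","b","a","b","c","c","c","a","a","b"),
                 total  = c(1,2,3,4,2,3,4,2,3,4),
                 total2 = c(2,3,4,2,3,4,5,6,4,2),
                 total3 = c(2,3,4,5,6,3,2,3,4,5))

# Step 1: one row of column sums per id.
# .SD contains every column except the grouping column named in by=.
subtotals <- dt[, lapply(.SD, sum), by = id]

# Step 2: append the subtotal rows to the original rows, then sort by id.
# The sort keeps tied rows in their existing order, so each subtotal
# ends up after the original rows of its group.
result <- rbind(dt, subtotals)[order(id)]
result
```

Keeping subtotals as its own object also makes it easy to inspect the sums before they are merged back in.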