Question

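I am using R and want to aggregate columns according to their group. In this example, instead of ten columns I would end up with three columns, high, medium and low, each holding the aggregated values. If these were rows I would use aggregate, but I don't know how to do this for columns.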

set.seed(4)
a<-matrix(runif(40),ncol=10,nrow=4)
colnames(a)<-letters[1:10]
a
               a         b          c         d         e
[1,] 0.585800305 0.8135742 0.94904022 0.1000535 0.9710557
[2,] 0.008945796 0.2604278 0.07314447 0.9540688 0.5839880
[3,] 0.293739612 0.7244059 0.75467503 0.4156071 0.9622046
[4,] 0.277374958 0.9060922 0.28600062 0.4551024 0.7617024
             f         g         h         i           j
[1,] 0.7145085 0.6491614 0.5137017 0.8779959 0.460025911
[2,] 0.9966129 0.8308064 0.5297775 0.6545220 0.622056487
[3,] 0.5062709 0.4819990 0.5671122 0.4823709 0.388418035
[4,] 0.4899432 0.8417462 0.2389489 0.9710298 0.006592727

type<-c("high","high","low","high","medium","high","medium","high","low","low")

Answer 1

We can replicate type so that there is one label per cell of the matrix, and pass that to tapply

tapply(a, type[col(a)], FUN = sum)
#    high       low    medium 
#10.352068  6.525872  6.082664
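
To unpack the indexing trick, the sketch below shows what type[col(a)] produces. It reuses the data from the question; the names idx, cell_group and totals are introduced here only for illustration.

```r
# Rebuild the example data from the question.
set.seed(4)
a <- matrix(runif(40), ncol = 10, nrow = 4)
colnames(a) <- letters[1:10]
type <- c("high", "high", "low", "high", "medium",
          "high", "medium", "high", "low", "low")

# col(a) has the same shape as a; each cell holds its own column index.
idx <- col(a)

# Indexing type by that matrix gives one group label per cell, in the same
# column-major order in which the values of a are stored.
cell_group <- type[idx]
length(cell_group) == length(a)   # one label per matrix cell
head(cell_group, 8)               # labels for the cells of columns a and b

# Summing the cells by label gives one total per group.
totals <- tapply(a, cell_group, FUN = sum)
totals
```

Because the labels line up cell by cell, any function that groups a flat vector (tapply, split, aggregate) can be applied to the matrix this way.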

Or, to get the sums row by row for each group

sapply(split(seq_along(type), type), function(i) rowSums(a[, i]))
#         high      low   medium
#[1,] 2.727638 2.287062 1.620217
#[2,] 2.749833 1.349723 1.414794
#[3,] 2.507136 1.625464 1.444204
#[4,] 2.367462 1.263623 1.603449
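
One caveat with this form: if a group contained only a single column, a[, i] would drop to a plain vector and rowSums() would stop with an error. A slightly more defensive sketch keeps the matrix shape with drop = FALSE. The helper name row_sums_by_group and the small test matrix are made up for illustration and are not part of the original answer.

```r
# Hypothetical helper: row sums for each group of columns.
row_sums_by_group <- function(m, groups) {
  stopifnot(is.matrix(m), length(groups) == ncol(m))
  sapply(split(seq_along(groups), groups),
         function(i) rowSums(m[, i, drop = FALSE]))  # drop = FALSE keeps a matrix
}

# Made-up data in which the groups "solo" and "other" have one column each.
m <- matrix(1:12, nrow = 3)
g <- c("pair", "solo", "pair", "other")
res <- row_sums_by_group(m, g)
res
```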

Or more compactly

sapply(split.default(as.data.frame(a), type), rowSums)
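
The compact form works because split.default treats a data frame as a list of columns, so the split happens across columns rather than rows. A small sketch using the data from the question (the names pieces and compact are illustrative only):

```r
set.seed(4)
a <- matrix(runif(40), ncol = 10, nrow = 4)
colnames(a) <- letters[1:10]
type <- c("high", "high", "low", "high", "medium",
          "high", "medium", "high", "low", "low")

# Each list element is a data frame holding the columns of one group.
pieces <- split.default(as.data.frame(a), type)
names(pieces)
sapply(pieces, ncol)   # number of columns that fell into each group

# rowSums accepts a data frame, so it can be applied to each piece directly.
compact <- sapply(pieces, rowSums)
compact
```

Since every piece stays a data frame even when it holds a single column, this variant does not need drop = FALSE.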

Or using aggregate

aggregate(Freq ~ ., as.data.frame.table(`colnames<-`(a, type)), FUN = sum)
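
The aggregate call returns its result in long form, one row per (row, group) pair. The sketch below spells out the intermediate steps and, as an optional extra that is not part of the original answer, pivots the result back to a rows-by-groups table with xtabs. The names long, res and wide are illustrative.

```r
set.seed(4)
a <- matrix(runif(40), ncol = 10, nrow = 4)
type <- c("high", "high", "low", "high", "medium",
          "high", "medium", "high", "low", "low")

# Relabel the columns with their group, then melt the matrix into long form.
# Because the dimensions are unnamed, the columns come out as Var1 (an
# automatically generated row label), Var2 (group label) and Freq (cell value).
long <- as.data.frame.table(`colnames<-`(a, type))
str(long)

# One sum per (row, group) pair, still in long form.
res <- aggregate(Freq ~ ., long, FUN = sum)
res

# Optional: pivot back to a table with rows as rows and groups as columns.
wide <- xtabs(Freq ~ Var1 + Var2, data = res)
wide
```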

Or use split to break the data into a list of vectors, one per group, and loop over the list to return the sum of each

sapply(split(a, type[col(a)]), sum)
#    high       low    medium 
#10.352068  6.525872  6.082664
