Question

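I have a question about fast row sums over every nth element.

Consider a matrix with 16 columns and m rows. The result should have 4 columns and m rows, where each column is the sum of every nth input column, i.e. the first column is the sum of columns 1, 5, 9, 13, the second is the sum of columns 2, 6, 10, 14, and so on.

Currently I do this with a matrix multiplication. However, this takes too long for large matrices. The solutions posted so far only sum n consecutive elements, not elements that are spread out.

/edit: This is how I am currently solving it: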

test <- matrix(c(1:24000),ncol=64)

SumFeatures <- function(ncol,nthElement) {
  ncolRes <- ncol/nthElement
  matrix(c(rep(diag(ncolRes),times = nthElement)),ncol = ncolRes,byrow = TRUE)
}

# Get Matrix to sum over every 4th element
sumMatrix <- SumFeatures(ncol(test),4)

system.time(test %*% sumMatrix)

Is there a faster way to do this?

Kind regards.

Answer 1

Using the matrix m derived from the built-in 11 x 8 data.frame anscombe as input:

# create test matrix m
m <- as.matrix(anscombe)

1) apply/tapply Try this:

t(apply(m, 1, tapply, gl(4, 1, ncol(m)), sum))

giving:

          1     2     3     4
 [1,] 18.04 19.14 17.46 14.58
 [2,] 14.95 16.14 14.77 13.76
 [3,] 20.58 21.74 25.74 15.71
 [4,] 17.81 17.77 16.11 16.84
 [5,] 19.33 20.26 18.81 16.47
 [6,] 23.96 22.10 22.84 15.04
 [7,] 13.24 12.13 12.08 13.25
 [8,]  8.26  7.10  9.39 31.50
 [9,] 22.84 21.13 20.15 13.56
[10,] 11.82 14.26 13.42 15.91
[11,] 10.68  9.74 10.73 14.89

2) tapply Or this, which gives the same result:

do.call(cbind, tapply(1:ncol(m), gl(4, 1, ncol(m)), function(ix) rowSums(m[, ix])))

3) tapply - 2 Or this, which gives a similar result:

matrix(tapply(m, gl(4 * nrow(m), 1, length(m)), sum), nrow(m))

4) apply/array Or this, which additionally requires that the same number of input columns go into each output column:

apply(array(m, c(nrow(m), 4, ncol(m) / 4)), 1:2, sum)

Note that for the m shown above this is just apply(array(m, c(11, 4, 2)), 1:2, sum).
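To make the reshaping in (4) concrete, here is a minimal sketch using the same anscombe matrix. The names a, res_apply and res_rowsums are only illustrative, and the rowSums(a, dims = 2) line is an assumption added here for comparison; it is not one of the benchmarked solutions.

```r
m <- as.matrix(anscombe)

# Reshape the 11 x 8 matrix into an 11 x 4 x 2 array without reordering data:
# slice a[, , k] holds input columns 4 * (k - 1) + 1 through 4 * k.
a <- array(m, c(nrow(m), 4, ncol(m) / 4))
stopifnot(all.equal(a[, , 2], unname(m[, 5:8])))

# Summing over the third dimension adds input column j to column j + 4.
res_apply <- apply(a, 1:2, sum)

# Assumption (not from the answer): rowSums() with dims = 2 sums over the
# third dimension of the same array and should give the same matrix.
res_rowsums <- rowSums(a, dims = 2)
stopifnot(all.equal(res_apply, res_rowsums))
```

Because the array is filled in column-major order, no data has to be rearranged; only the grouping of columns into slices changes.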

5) for This alternative is based on a for loop:

res <- 0
for(i in seq(1, ncol(m), 4)) res <- res + m[, seq(i, length = 4)]
res

This could be sped up a little by initializing res to m[, 1:4] and then starting i at 4 + 1, but the code becomes a bit ugly, so we won't bother.
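For reference, the variant just described might look like the following sketch. It has not been benchmarked here, and it assumes that m has at least two blocks of 4 columns (with exactly 4 columns, seq(5, 4, by = 4) would raise an error).

```r
m <- as.matrix(anscombe)

# Start from the first block of 4 columns instead of from 0 ...
res <- m[, 1:4]

# ... and add the remaining blocks, beginning at column 4 + 1.
for (i in seq(4 + 1, ncol(m), by = 4)) {
  res <- res + m[, seq(i, length.out = 4)]
}
res
```

This saves one matrix addition compared with starting from res <- 0. Note that res keeps the column names of the first block.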

6) Reduce

matrix(Reduce("+", split(m, gl(ncol(m) / 4, nrow(m) * 4))), nrow(m))

7）rowsum

t(rowsum(t(m), gl(4, 1, ncol(m))))

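Note: Of the solutions tested below,

(6), (5) and (4) are the fastest, in that order (i.e. (6) is the fastest). These three also require that the number of columns of m be an exact multiple of 4. (2) is the fastest solution that does not require this, followed by (3), (7) and (1), where (1) is the slowest.
(7) is the shortest, (1) is the next shortest and (4) is the third shortest.

Here are the benchmarks: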

library(rbenchmark)
benchmark(
  one = t(apply(m, 1, tapply, gl(4, 1, ncol(m)), sum)),
  two = do.call(cbind, 
         tapply(1:ncol(m), gl(4, 1, ncol(m)), function(ix) rowSums(m[, ix]))),
  three = matrix(tapply(m, gl(4 * nrow(m), 1, length(m)), sum), nrow(m)), 
  four = apply(array(m, c(nrow(m), 4, ncol(m) / 4)), 1:2, sum),
  five = {res <- 0
          for(i in seq(1, ncol(m), 4)) res <- res + m[, seq(i, length = 4)]
          res },
  six = matrix(Reduce("+", split(m, gl(ncol(m) / 4, nrow(m) * 4))), nrow(m)),
  seven = t(rowsum(t(m), gl(4, 1, ncol(m)))),
  order = "relative", replications = 1000)[1:4]

giving:

   test replications elapsed relative
6   six         1000    0.12    1.000
5  five         1000    0.18    1.500
4  four         1000    0.30    2.500
2   two         1000    0.31    2.583
3 three         1000    0.39    3.250
7 seven         1000    0.58    4.833
1   one         1000    2.27   18.917
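The timings above are for the small 11 x 8 matrix, so the ranking may differ for the large matrices mentioned in the question. A minimal sketch for repeating the comparison on a bigger input with base R's system.time follows. The matrix size, the seed and the wrapper names two, six and seven are arbitrary assumptions, and no timings are claimed here.

```r
set.seed(1)
big <- matrix(rnorm(5000 * 64), ncol = 64)  # hypothetical test size

# Wrappers around solutions (2), (6) and (7) from above.
two <- function(m) do.call(cbind,
  tapply(1:ncol(m), gl(4, 1, ncol(m)), function(ix) rowSums(m[, ix])))
six <- function(m) matrix(Reduce("+", split(m, gl(ncol(m) / 4, nrow(m) * 4))), nrow(m))
seven <- function(m) t(rowsum(t(m), gl(4, 1, ncol(m))))

# Check that the solutions agree (up to dimnames) before timing them.
stopifnot(all.equal(unname(two(big)), six(big)),
          all.equal(unname(seven(big)), six(big)))

system.time(for (r in 1:20) two(big))
system.time(for (r in 1:20) six(big))
system.time(for (r in 1:20) seven(big))
```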

Answer 2

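In my experience, the absolute fastest computation speed is achieved when you reduce the problem to operations between two one-dimensional arrays that are contiguous in memory. This often involves reshaping your data, which can be an expensive operation, but it pays off if you need to repeat the computation many times.

Taking the 11 x 8 matrix as an example (the same one as in G. Grothendieck's solution), I would do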

dim(m) <- c(44, 2)
out <- m[, 1] + m[, 2]
dim(out) <- c(11, 4)

Note that when reshaping arrays, t() and aperm() copy the data and are therefore slow, whereas changing the dim attribute is fast.
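The same idea can be sketched for an arbitrary number of column blocks. The helper name sum_every_nth and its argument n_out are hypothetical and only for illustration; the sketch assumes that ncol(m) is an exact multiple of n_out, and it has not been benchmarked.

```r
# Hypothetical helper: sum every n_out-th column of m, giving n_out columns.
sum_every_nth <- function(m, n_out) {
  stopifnot(ncol(m) %% n_out == 0)
  n_rows <- nrow(m)
  n_blocks <- ncol(m) / n_out

  # Each column of the reshaped matrix is one contiguous block of n_out
  # input columns. Only the function's local m is changed.
  dim(m) <- c(n_rows * n_out, n_blocks)

  # Add the blocks as plain one-dimensional vectors.
  out <- m[, 1]
  for (k in seq_len(n_blocks)[-1]) out <- out + m[, k]

  dim(out) <- c(n_rows, n_out)
  out
}

m <- as.matrix(anscombe)
out <- sum_every_nth(m, 4)
```

Unlike the three-line version above, the caller's m keeps its original 11 x 8 shape, because the dim assignment only affects the copy inside the function.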

R - Fast rowsum of every nth element of a matrix
