Question

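I have a matrix and am looking for an efficient way to replicate it n times (where n is the number of observations in the dataset). For example, if I have a matrix A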

A <- matrix(1:15, nrow=3)

then I would like an output of the form

rbind(A, A, A, ...) # n times

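Obviously, there are many ways to construct such a large matrix, for example with a for loop, apply, or similar functions. However, the call to the matrix-replication function sits at the very core of my optimization algorithm, where it is called tens of thousands of times during a single run of my program. Loops, apply-type functions, and the like are therefore not efficient enough. (Such a solution essentially means that a loop over n is executed tens of thousands of times, which is obviously inefficient.) I have already tried the ordinary rep function, but have not found a way to arrange its output into a matrix of the desired format.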

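The solution do.call("rbind", replicate(n, A, simplify=FALSE)) is also too inefficient, because rbind is called too often in this case. (About 30% of my program's total runtime is then spent performing the rbinds.)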

Does anyone know a better solution?

Answer 1

Two more solutions:

The first is a modification of the example in the question:

do.call("rbind", rep(list(A), n))

The second unrolls the matrix into a vector, replicates that vector, and reassembles it into a matrix:

matrix(rep(t(A),n), ncol=ncol(A), byrow=TRUE)
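
As a quick sanity check of the unroll-and-reassemble expression above, the sketch below compares it with the do.call version on a small case. The value n = 4 and the names stacked and reference are chosen here purely for illustration.

```r
# Small check of the unroll-and-reassemble approach (illustrative values).
A <- matrix(1:15, nrow = 3)
n <- 4

# t(A) unrolls into A's rows laid end to end, so refilling by row restores them.
stacked <- matrix(rep(t(A), n), ncol = ncol(A), byrow = TRUE)

# Reference result built with the first solution.
reference <- do.call("rbind", rep(list(A), n))

stopifnot(identical(stacked, reference))
dim(stacked)  # 12 rows, 5 columns
```

The transpose matters: rep(A, n) without t() would unroll A column by column, and the byrow = TRUE refill would then no longer reproduce A's rows.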

Since efficiency is what was asked for, benchmarking is necessary:

library("rbenchmark")
A <- matrix(1:15, nrow=3)
n <- 10

benchmark(rbind(A, A, A, A, A, A, A, A, A, A),
          do.call("rbind", replicate(n, A, simplify=FALSE)),
          do.call("rbind", rep(list(A), n)),
          apply(A, 2, rep, n),
          matrix(rep(t(A),n), ncol=ncol(A), byrow=TRUE),
          order="relative", replications=100000)

This gives:

                                                 test replications elapsed relative user.self sys.self user.child sys.child
1                 rbind(A, A, A, A, A, A, A, A, A, A)       100000    0.91    1.000      0.91        0         NA        NA
3                   do.call("rbind", rep(list(A), n))       100000    1.42    1.560      1.42        0         NA        NA
5  matrix(rep(t(A), n), ncol = ncol(A), byrow = TRUE)       100000    2.20    2.418      2.19        0         NA        NA
2 do.call("rbind", replicate(n, A, simplify = FALSE))       100000    3.03    3.330      3.03        0         NA        NA
4                                 apply(A, 2, rep, n)       100000    7.75    8.516      7.73        0         NA        NA

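So the fastest is the raw rbind call, but that assumes n is fixed and known ahead of time. If n is not fixed, the fastest is do.call("rbind", rep(list(A), n)). These timings are for a 3x5 matrix and 10 replications. Matrices of different sizes may give a different ordering.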

EDIT:

For n = 600, the results come out in a different order (leaving out the explicit rbind version):

A <- matrix(1:15, nrow=3)
n <- 600

benchmark(do.call("rbind", replicate(n, A, simplify=FALSE)),
          do.call("rbind", rep(list(A), n)),
          apply(A, 2, rep, n),
          matrix(rep(t(A),n), ncol=ncol(A), byrow=TRUE),
          order="relative", replications=10000)

This gives:

                                                 test replications elapsed relative user.self sys.self user.child sys.child
4  matrix(rep(t(A), n), ncol = ncol(A), byrow = TRUE)        10000    1.74    1.000      1.75        0         NA        NA
3                                 apply(A, 2, rep, n)        10000    2.57    1.477      2.54        0         NA        NA
2                   do.call("rbind", rep(list(A), n))        10000    2.79    1.603      2.79        0         NA        NA
1 do.call("rbind", replicate(n, A, simplify = FALSE))        10000    6.68    3.839      6.65        0         NA        NA

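If you include the explicit rbind version, it is slightly faster than the do.call("rbind", rep(list(A), n)) version, but it is no faster than the apply or matrix versions. So in this case, generalizing to arbitrary n does not cost any speed.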

Answer 2

This is probably more efficient:

apply(A, 2, rep, n)
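
A minimal sketch of why this produces the stacked matrix: apply() with margin 2 calls rep(column, n) on each column of A and binds the results together as columns, which is the same as repeating every row block n times. The value n = 2 and the name by_apply are illustrative only.

```r
A <- matrix(1:15, nrow = 3)
n <- 2

# Each column of A is repeated n times; apply() binds the results as columns.
by_apply <- apply(A, 2, rep, n)

# Same values and shape as stacking A on top of itself.
stopifnot(identical(by_apply, rbind(A, A)))
dim(by_apply)  # 6 rows, 5 columns
```

One thing to keep in mind: apply() returns a plain vector when each call yields a single value, so a one-row A with n = 1 would not come back as a matrix.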

Answer 3

There is also this way:

rep(1, n) %x% A
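
Here %x% is the Kronecker product operator from base R. Multiplying a length-n vector of ones by A yields n copies of A stacked vertically. The sketch below checks this for n = 3 (an illustrative value); note that because rep(1, n) is a double vector, the result is stored as double even when A is an integer matrix, so all.equal() is used for the comparison rather than identical().

```r
A <- matrix(1:15, nrow = 3)
n <- 3

# Kronecker product of a vector of ones with A: n stacked copies of A.
K <- rep(1, n) %x% A

# Values match the stacked matrix; the storage type differs (double vs integer).
stopifnot(isTRUE(all.equal(K, rbind(A, A, A))))
stopifnot(is.double(K))
dim(K)  # 9 rows, 5 columns
```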

Answer 4

What about converting it to an array, replicating the contents, and creating a new matrix with the updated number of rows?

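A <- matrix(...)
n <- 2 # just a test

# Unroll row by row (hence t(A)) so that the byrow = TRUE refill restores A's rows.
# as.integer() assumes an integer matrix; use as.vector() for other types.
a <- as.integer(t(A))
multi.a <- rep(a, n)
multi.A <- matrix(multi.a, nrow = nrow(A) * n, byrow = TRUE)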

Answer 5

You can use indexing:

A[rep(seq(nrow(A)), n), ]
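
A slightly more defensive variant of the indexing approach is sketched below. The helper name rep_rows is hypothetical and not from the answer; it uses seq_len() instead of seq() and adds drop = FALSE so that a one-column matrix is not collapsed into a vector.

```r
# Hypothetical helper wrapping the indexing approach.
rep_rows <- function(A, n) {
  # Repeat the row indices 1..nrow(A) n times; drop = FALSE keeps a matrix.
  A[rep(seq_len(nrow(A)), times = n), , drop = FALSE]
}

A <- matrix(1:15, nrow = 3)
stopifnot(identical(rep_rows(A, 2), rbind(A, A)))

# A one-column matrix stays a matrix instead of becoming a vector.
B <- matrix(1:3, ncol = 1)
dim(rep_rows(B, 2))  # 6 rows, 1 column
```

Without drop = FALSE, the plain A[rep(seq(nrow(A)), n), ] returns a vector for a single-column A, which may break code that expects a matrix downstream.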

Answer 6

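I came here for the same reason as the original poster and ended up updating @Brian Diggs's comparison to include all the other posted answers. Hopefully I did it right: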

#install.packages("rbenchmark")
library("rbenchmark")
A <- matrix(1:15, nrow=3)
n <- 600

benchmark(do.call("rbind", replicate(n, A, simplify=FALSE)),
          do.call("rbind", rep(list(A), n)),
          apply(A, 2, rep, n),
          matrix(rep(t(A),n), ncol=ncol(A), byrow=TRUE),
          A[rep(seq(nrow(A)), n), ],
          rep(1, n) %x% A,
          apply(A, 2, rep, n),
          matrix(rep(as.integer(t(A)),n),nrow=nrow(A)*n,byrow=TRUE),
          order="relative", replications=10000)

#                                                                test replications elapsed relative user.self sys.self user.child sys.child
#5                                          A[rep(seq(nrow(A)), n), ]        10000    0.32    1.000      0.33     0.00         NA        NA
#8 matrix(rep(as.integer(t(A)), n), nrow = nrow(A) * n, byrow = TRUE)        10000    0.36    1.125      0.35     0.02         NA        NA
#4                 matrix(rep(t(A), n), ncol = ncol(A), byrow = TRUE)        10000    0.38    1.188      0.37     0.00         NA        NA
#3                                                apply(A, 2, rep, n)        10000    0.59    1.844      0.56     0.03         NA        NA
#7                                                apply(A, 2, rep, n)        10000    0.61    1.906      0.58     0.03         NA        NA
#6                                                    rep(1, n) %x% A        10000    1.44    4.500      1.42     0.02         NA        NA
#2                                  do.call("rbind", rep(list(A), n))        10000    1.67    5.219      1.67     0.00         NA        NA
#1                do.call("rbind", replicate(n, A, simplify = FALSE))        10000    5.03   15.719      5.02     0.01         NA        NA
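
Before relying on timings like these, it can be worth confirming that every candidate actually returns the same matrix. The following is a minimal sketch of such a check for the expressions benchmarked above; the list names and the helper variables are chosen here for illustration. all.equal() is used because the Kronecker version returns doubles while the others return integers.

```r
A <- matrix(1:15, nrow = 3)
n <- 600

# Evaluate each benchmarked expression once.
candidates <- list(
  do_call_rep = do.call("rbind", rep(list(A), n)),
  apply_rep   = apply(A, 2, rep, n),
  matrix_rep  = matrix(rep(t(A), n), ncol = ncol(A), byrow = TRUE),
  matrix_int  = matrix(rep(as.integer(t(A)), n), nrow = nrow(A) * n, byrow = TRUE),
  indexing    = A[rep(seq(nrow(A)), n), ],
  kronecker   = rep(1, n) %x% A
)

# Compare every result against the first one, ignoring integer/double storage.
reference <- candidates[[1]]
same <- vapply(candidates,
               function(m) isTRUE(all.equal(m, reference)),
               logical(1))
stopifnot(all(same))
```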

Efficiently replicating a matrix in R