Question

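I am trying to mimic R's matrix indexing, which permutes the rows and columns of a matrix according to an index vector, as in the code below: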

m=matrix(sample(c(0:9),5*5,T),ncol=5,nrow=5)
diag(m)=0
rand=sample(c(1:5))
m[rand,rand]

I tried the following code in C++:

library(Rcpp)
cppFunction('
NumericMatrix test(NumericMatrix& M, int col, IntegerVector& rand) {
  // rand must hold 0-based indices; the diagonal of M2 is left at 0
  NumericMatrix M2(col,col);
  for(int a=0;a<col;a++){
    for(int b=a+1;b<col;b++){
      M2(b,a)=M(rand(b),rand(a));
      M2(a,b)=M(rand(a),rand(b));
    }
  }
  return M2;
}
')

But it is slow:

microbenchmark::microbenchmark(test(m,5,(rand-1)),m[rand,rand])

Any ideas on how to speed this up?

Answer 1

Use a simpler loop:

#include <Rcpp.h>
using namespace Rcpp;

// [[Rcpp::export]]
NumericMatrix test(NumericMatrix& M, int col, IntegerVector& rand) {
  NumericMatrix M2(col,col);
  for(int a=0;a<col;a++){
    for(int b=a+1;b<col;b++){
      M2(b,a)=M(rand(b),rand(a));
      M2(a,b)=M(rand(a),rand(b));
    }
  }
  return M2;   
}

// [[Rcpp::export]]
NumericMatrix test2(const NumericMatrix& M, const IntegerVector& ind) {

  int col = M.ncol();
  NumericMatrix M2(col, col);

  for (int j = 0; j < col; j++)
    for (int i = 0; i < col; i++)
      M2(i, j) = M(ind[i], ind[j]);

  return M2;   
}


/*** R
N <- 500
m <- matrix(sample(c(0:9), N * N, TRUE), ncol = N, nrow = N)
diag(m) <- 0
rand <- sample(N)

all.equal(test(m, ncol(m), rand - 1), m[rand, rand])
all.equal(test2(m, rand - 1), m[rand, rand])

microbenchmark::microbenchmark(
  test(m, ncol(m), rand - 1),
  m[rand, rand],
  test2(m, rand - 1)
)
*/

For N = 5 the R version is faster, but the difference is on the order of nanoseconds. With N = 500, for example, you get:

Unit: microseconds
                       expr      min       lq     mean   median       uq      max neval
 test(m, ncol(m), rand - 1) 2092.474 2233.020 2843.145 2360.654 2548.050 7412.057   100
              m[rand, rand] 1422.352 1506.117 2064.500 1578.129 1718.345 6700.219   100
         test2(m, rand - 1)  698.595  769.944 1161.747  838.811  928.535 5379.841   100
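
A rough sketch of why the loop order in test2 is likely what helps: R stores matrices in column-major order, so with the column index j in the outer loop and the row index i in the inner loop, the writes to the result land on consecutive memory locations. The plain C++ below (no Rcpp) spells out the same indexing on a flat buffer. The name permute_square is a hypothetical helper, and the std::vector layout is an assumption standing in for NumericMatrix, not something taken from the benchmark above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of Rcpp: permute the rows and columns of an
// n x n matrix stored column-major, i.e. element (i, j) lives at
// data[i + n * j], which is the layout R uses. `ind` holds 0-based indices.
std::vector<double> permute_square(const std::vector<double>& data,
                                   std::size_t n,
                                   const std::vector<std::size_t>& ind) {
  // Preconditions (checked only when NDEBUG is not defined).
  assert(data.size() == n * n);
  assert(ind.size() == n);
  for (std::size_t k : ind) assert(k < n);

  std::vector<double> out(n * n);
  for (std::size_t j = 0; j < n; ++j) {      // outer loop: destination column
    const std::size_t src_col = n * ind[j];  // offset of source column ind[j]
    for (std::size_t i = 0; i < n; ++i)      // inner loop: consecutive writes
      out[i + n * j] = data[ind[i] + src_col];
  }
  return out;
}
```

Here out(i, j) equals the source element (ind[i], ind[j]), which is what m[rand, rand] computes in R once the indices are shifted to 0-based. The reads from the source are still scattered within a column, so this only makes the writes sequential.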

Rcpp matrix row and column permutation
