我有一个R函数,对于长度约为5000的向量,我需要计算大约一百万次。是否有可能通过在Rcpp中实现它来加快速度?之前我几乎没有使用过Rcpp,下面的代码也不起作用:
set.seet(1)
a <- rt(5e3, df = 2)
b <- rt(5e3, df = 2.5)
c <- rt(5e3, df = 3)
d <- rt(5e3, df = 3.5)
sum((1 - outer(a, b, pmax)) * (1 - outer(c, d, pmax)))
#[1] -367780.1
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
double f_outer(NumericVector u, NumericVector v, NumericVector x, NumericVector y) {
double result = sum((1 - Rcpp::outer(u, v, Rcpp::pmax)) * (1 - Rcpp::outer(x, y, Rcpp::pmax)));
return(result);
}
非常感谢您!
答案 0 :(得分:4)
F. Privé是正确的-我们将在此处进行循环;我在文件so-answer.cpp
中有以下C ++代码:
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
double f_outer(NumericVector u, NumericVector v, NumericVector x, NumericVector y) {
// We'll use the size of the first and second vectors for our for loops
int n = u.size();
int m = v.size();
// Make sure the vectors are appropriately sized for what we're doing
if ( (n != x.size() ) || ( m != y.size() ) ) {
::Rf_error("Vectors not of compatible sizes.");
}
// Initialize a result variable
double result = 0.0;
// And use loops instead of outer
for ( int i = 0; i < n; ++i ) {
for ( int j = 0; j < m; ++j ) {
result += (1 - std::max(u[i], v[j])) * (1 - std::max(x[i], y[j]));
}
}
// Then return the result
return result;
}
然后我们在R中看到C ++代码给出的答案与您的R代码相同,但运行速度更快很多:
library(Rcpp) # for sourceCpp()
library(microbenchmark) # for microbenchmark() (for benchmarking)
sourceCpp("so-answer.cpp") # compile our C++ code and make it available in R
set.seed(1) # for reproducibility
a <- rt(5e3, df = 2)
b <- rt(5e3, df = 2.5)
c <- rt(5e3, df = 3)
d <- rt(5e3, df = 3.5)
sum((1 - outer(a, b, pmax)) * (1 - outer(c, d, pmax)))
#> [1] -69677.99
f_outer(a, b, c, d)
#> [1] -69677.99
# Same answer, so looking good. Which one's faster?
microbenchmark(base = sum((1 - outer(a, b, pmax)) * (1 - outer(c, d, pmax))),
rcpp = f_outer(a, b, c, d))
#> Unit: milliseconds
#> expr min lq mean median uq max neval
#> base 3978.9201 4119.6757 4197.9292 4131.3300 4144.4524 10121.5558 100
#> rcpp 118.8963 119.1531 129.4071 119.4767 122.5218 909.2744 100
#> cld
#> b
#> a
由reprex package(v0.2.1)于2018-12-13创建