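Parallelizing with Rcpp/Armadillo and OpenMP when calling user-defined functions

Asked: 2017-11-13 22:35:35

Tags: r openmp rcpp

I am trying to speed up a loop in R using Rcpp/Armadillo and OpenMP. The loop takes as input a matrix in which each row contains indices into a location vector (or a location matrix, for 2D locations), along with some other matrices and vectors. Inside the loop, I extract each row of the index matrix, look up the corresponding locations, compute the distance matrix and the covariance matrix, do a Cholesky factorization and a backsolve, and save the backsolve result into a new matrix. Here is the Rcpp code: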

#include <iostream>
#include <RcppArmadillo.h>
#include <omp.h>
#include <Rcpp.h>

// [[Rcpp::plugins(openmp)]]
using namespace Rcpp;
using namespace arma;
using namespace std;

// [[Rcpp::depends(RcppArmadillo)]]
// [[Rcpp::export]]
mat NZentries_new2 (int m, int nnp, const mat& locs, const umat& revNNarray, const mat& revCondOnLatent, const vec& nuggets, const vec covparms){
  // initialized the output matrix
  mat Lentries=zeros(nnp,m+1);
  // initialized objects in parallel part
  int n0; //number of !is_na elements
  uvec inds;
  vec revCon_row;
  uvec inds00;
  vec nug;
  mat covmat;
  vec onevec;
  vec M;
  mat dist;
  int k;
  omp_set_num_threads(2);// selects the number of cores to use.
  #pragma omp parallel for shared(locs,revNNarray,revCondOnLatent,nuggets,nnp,m,Lentries) private(k,M,dist,onevec,covmat,nug,n0,inds,revCon_row,inds00) default(none) schedule(static)
  for (k = 0; k < nnp; k++) {
    // extract a row to work with
    inds=revNNarray.row(k).t();
    revCon_row=revCondOnLatent.row(k).t();

    if (k < m){
      n0=k+1;
    } else {
      n0=m+1;
    }
    // extract locations
    inds00=inds(span(m+1-n0,m))-ones<uvec>(n0);

    nug=nuggets.elem(inds00) % (ones(n0)-revCon_row(span(m+1-n0,m))); // vec is vec, cannot convert to mat
    dist=calcPWD2(locs.rows(inds00));

    #pragma omp critical
    {
      //calculate covariance matrix
      covmat= MaternFun(dist,covparms) + diagmat(nug) ; // summation from arma
    }

    // get last row of inverse Cholesky
    onevec = zeros(n0);
    onevec[n0-1] = 1;
    M=solve(chol(covmat,"upper"),onevec);
    // save the entries to matrix
    Lentries(k,span(0,n0-1)) = M.t();
  }
  return Lentries;
}

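The current version works, but it is slow (almost the same speed as the non-parallel version). If I move the line out of the omp critical block, I get a segfault and R crashes. MaternFun is a function I defined myself; it is shown below along with a few other small helper functions. So my question is: why does MaternFun have to stay inside the critical section?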

    // [[Rcpp::export]]
    mat MaternFun( mat distmat, vec covparms ){

      int d1 = distmat.n_rows;
      int d2 = distmat.n_cols;
      int j1;
      int j2;
      mat covmat(d1,d2);
      double scaledist;

      double normcon = covparms(0)/(pow(2.0,covparms(2)-1)*Rf_gammafn(covparms(2)));

      for (j1 = 0; j1 < d1; j1++){
        for (j2 = 0; j2 < d2; j2++){
          if ( distmat(j1,j2) == 0 ){
            covmat(j1,j2) = covparms(0);
          } else {
            scaledist = distmat(j1,j2)/covparms(1);
            covmat(j1,j2) = normcon*pow( scaledist, covparms(2) )*
              Rf_bessel_k(scaledist,covparms(2),1.0);
          }
        }
      }
      return covmat;
    }

    // [[Rcpp::export]]
    double dist2(double lat1,double long1,double lat2,double long2) {
      double dist = sqrt(pow(lat1 - lat2, 2) + pow(long1 - long2, 2)) ;
      return (dist) ;
    }
    // [[Rcpp::export]]
    mat calcPWD2( mat x) {//Rcpp::NumericMatrix
      int outrows = x.n_rows ;

      int outcols = x.n_rows ;
      mat out(outrows, outcols) ;
      for (int arow = 0 ; arow < outrows ; arow++) {
        for (int acol = 0 ; acol < outcols ; acol++) {
          out(arow, acol) = dist2(x(arow, 0),x(arow, 1),
                                  x(acol, 0),x(acol, 1)) ; //extract element from mat 
        }
      }
      return (out) ;
    }

Here is some sample input for testing MaternFun in R:

    library(fields)
    distmat=rdist(1:5) # distance matrix
    covparms=c(1,0.2,1.5)

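1 Answer:

Answer 0: (score: 1)

The problem is the two calls to R math functions (Rf_bessel_k and Rf_gammafn). They go through the R API, which is not thread-safe, so they must be called from a single thread rather than from inside a parallel region.

To fix this, add a dependency on Boost via the BH package to get the cyl_bessel_k and tgamma functions. Alternatively, there is always the option of reimplementing R's besselK and gamma in C++, so that the code no longer relies on the single-threaded R variants.

This gives: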

#include <Rcpp.h>
#include <boost/math/special_functions/bessel.hpp>
#include <boost/math/special_functions/gamma.hpp>

// [[Rcpp::depends(BH)]]

// [[Rcpp::export]]
double besselK_boost(double x, double v) {
  return boost::math::cyl_bessel_k(v, x);
}

// [[Rcpp::export]]
double gamma_fn_boost(double x) {
  return boost::math::tgamma(x);
}

Test code:

x0 = 9.536743e-07
nu = -10
all.equal(besselK(x0, nu), besselK_boost(x0, nu))
# [1] TRUE

x = 2
all.equal(gamma(x), gamma_fn_boost(x))
# [1] TRUE

Note: the Boost variant takes its arguments in a different order than R:

cyl_bessel_k(v, x)
Rf_bessel_k(x, v, expon.scaled = FALSE)
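If adding BH is not an option, a third route may be the C++17 standard library, whose `<cmath>` defines `std::cyl_bessel_k` and `std::tgamma`; neither goes through the R API. The sketch below is only an illustration under that assumption: it needs a toolchain whose standard library actually ships the mathematical special functions (libstdc++ does, others may not), and `matern_std` is a hypothetical helper name, not part of the question's code. It evaluates the same Matern formula as MaternFun for a single distance.

```cpp
#include <cmath>

// Hypothetical helper (not from the question): Matern covariance at one distance,
// using the same parametrization as MaternFun, i.e. covparms = (variance, range, smoothness).
// Assumes a C++17 toolchain whose <cmath> provides std::cyl_bessel_k.
double matern_std(double dist, double variance, double range, double smoothness) {
  if (dist == 0.0) {
    return variance;  // zero distance: the covariance equals the variance
  }
  const double normcon =
      variance / (std::pow(2.0, smoothness - 1.0) * std::tgamma(smoothness));
  const double scaledist = dist / range;
  // Argument order is (order, x), the same as boost::math::cyl_bessel_k.
  return normcon * std::pow(scaledist, smoothness) *
         std::cyl_bessel_k(smoothness, scaledist);
}
```

A quick sanity check: for smoothness 0.5 this formula reduces to variance * exp(-dist / range), so `matern_std(0.3, 2.0, 0.2, 0.5)` should agree with `2.0 * std::exp(-1.5)` up to rounding.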

From here, we can modify MaternFun. Unfortunately, because calcPWD2 is missing, the furthest we can go is switching to Boost and adding the OpenMP guards:

#include <RcppArmadillo.h>
#include <boost/math/special_functions/bessel.hpp>
#include <boost/math/special_functions/gamma.hpp>

#ifdef _OPENMP
#include <omp.h>
#endif

// [[Rcpp::depends(RcppArmadillo)]]
// [[Rcpp::depends(BH)]]
// [[Rcpp::plugins(openmp)]]

// [[Rcpp::export]]
arma::mat MaternFun(arma::mat distmat, arma::vec covparms) {

  int d1 = distmat.n_rows;
  int d2 = distmat.n_cols;
  int j1;
  int j2;
  arma::mat covmat(d1,d2);
  double scaledist;

  double normcon = covparms(0) /
    (pow(2.0, covparms(2) - 1) * boost::math::tgamma(covparms(2)));

  for (j1 = 0; j1 < d1; ++j1){
    for (j2 = 0; j2 < d2; ++j2){
      if ( distmat(j1, j2) == 0 ){
        covmat(j1, j2) = covparms(0);
      } else {
        scaledist = distmat(j1, j2)/covparms(1);
        covmat(j1, j2) = normcon * pow( scaledist, covparms(2) ) *
          boost::math::cyl_bessel_k(covparms(2), scaledist);
      }
    }
  }
  return covmat;
}
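With the R API calls gone from MaternFun, the critical section in the calling loop should no longer be needed for it. As a rough illustration of the loop pattern only, here is a plain C++ sketch that uses `std::vector` instead of Armadillo so that it compiles on its own. `Point`, `pairwise_dist`, and `block_sums` are hypothetical names, the block of preceding points stands in for the neighbor lookup through revNNarray, and the sum of distances is just a placeholder for the covariance and Cholesky step.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one row of locs (a 2D location).
struct Point { double x; double y; };

// Plain C++ counterpart of calcPWD2: all pairwise Euclidean distances.
std::vector<std::vector<double>> pairwise_dist(const std::vector<Point>& pts) {
  const std::size_t n = pts.size();
  std::vector<std::vector<double>> out(n, std::vector<double>(n, 0.0));
  for (std::size_t i = 0; i < n; ++i) {
    for (std::size_t j = 0; j < n; ++j) {
      out[i][j] = std::hypot(pts[i].x - pts[j].x, pts[i].y - pts[j].y);
    }
  }
  return out;
}

// For each k, build a block of up to m + 1 points ending at k and reduce it.
// The reduction (sum of distances) is a placeholder for the real per-row work.
std::vector<double> block_sums(const std::vector<Point>& locs, int m) {
  const int nnp = static_cast<int>(locs.size());
  std::vector<double> result(nnp, 0.0);
  // If OpenMP is not enabled, the pragma is ignored and the loop runs serially.
  #pragma omp parallel for schedule(static)
  for (int k = 0; k < nnp; ++k) {
    const int n0 = (k < m) ? k + 1 : m + 1;  // same block-size rule as the question
    // Temporaries are declared inside the loop body, so each thread has its own copies.
    std::vector<Point> block(locs.begin() + (k + 1 - n0), locs.begin() + (k + 1));
    const std::vector<std::vector<double>> dist = pairwise_dist(block);
    double total = 0.0;
    for (const auto& row : dist) {
      for (double d : row) {
        total += d;
      }
    }
    result[k] = total;  // each iteration writes only to its own slot
  }
  return result;
}
```

The design point is that nothing inside the loop body touches R or shared mutable state: inputs are read-only, temporaries are local to the iteration, and every iteration writes to a distinct output element, so no `private` clause or critical section is required in this sketch.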