How to avoid reading data from the R environment inside an Rcpp function

Time: 2019-03-08 23:26:40

Tags: r optimization rcpp

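Although MyCppFunction(NumericVector x) returns the desired output, I am not sure whether there is a proper and efficient way to avoid reading the variable myY from the R environment without passing it as a function argument.

The reason I do not pass the data as an argument is that I will eventually pass the C++ function as the objective function of a minimization, and the minimization routine only accepts functions of one argument, namely myX. Just as an example: in R, I would pass myY to optim(...) in the following way: optim(par, fn=MyRFunction, y=myY)

Any advice on how to properly access myY from within the C++ function is much appreciated. Below is a minimal example of what I fear is the wrong way to do it:

Update: I have modified the code to better reflect the context as well as what was proposed in the answers. Just in case, the focus of my question is this line: NumericVector y = env["myY"]; // How to avoid this?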

#include <Rcpp.h>
using namespace Rcpp;

// [[Rcpp::export]]
double MyCppFunction(NumericVector x) {

  Environment env = Environment::global_env();
  NumericVector y = env["myY"];  // How to avoid this?

  double res = 0;

  for (int i = 0; i < x.size(); i++) res = res + (x(i) * y(i));

  return res;
}

double MyCppFunctionNoExport(NumericVector x) {

  Environment env = Environment::global_env();
  NumericVector y = env["myY"];  // How to avoid this?

  double res = 0;

  for (int i = 0; i < x.size(); i++) res = res + (x(i) * y(i));

  return res;
}

// [[Rcpp::export]]
double MyCppFunction2(NumericVector x, NumericVector y) {
  double res = 0;

  for (int i = 0; i < x.size(); i++) res = res + (x(i) * y(i));

  return res;
}

// [[Rcpp::export]]
double MyRoutine(NumericVector x, Function fn) {

  for (int i = 0; i < x.size(); i++) fn(x);

  return 0;
}

// [[Rcpp::export]]
double MyRoutineNoExport(NumericVector x) {

  for (int i = 0; i < x.size(); i++) MyCppFunctionNoExport(x);

  return 0;
}

/*** R
MyRFunction <- function(x, y=myY) {
  res = 0
  for(i in 1:length(x)) res = res + (x[i]*y[i])
  return (res)
}

callMyCppFunction2 <- function(x) {
   MyCppFunction2(x, myY)
}

set.seed(123456)

myY = rnorm(1e3)
myX = rnorm(1e3)

all.equal(MyCppFunction(myX), MyRFunction(myX), callMyCppFunction2(myX))

require(rbenchmark)

benchmark(MyRoutine(myX, fn=MyCppFunction),
          MyRoutine(myX, fn=MyRFunction),
          MyRoutine(myX, fn=callMyCppFunction2),
          MyRoutineNoExport(myX), order="relative")[, 1:4]

*/

Output

$ Rscript -e 'Rcpp::sourceCpp("stack.cpp")'
> MyRFunction <- function(x, y = myY) {
+     res = 0
+     for (i in 1:length(x)) res = res + (x[i] * y[i])
+     return(res)
+ }

> callMyCppFunction2 <- function(x) {
+     MyCppFunction2(x, myY)
+ }

> set.seed(123456)

> myY = rnorm(1000)

> myX = rnorm(1000)

> all.equal(MyCppFunction(myX), MyRFunction(myX), callMyCppFunction2(myX))
[1] TRUE

> require(rbenchmark)
Loading required package: rbenchmark

> benchmark(MyRoutine(myX, fn = MyCppFunction), MyRoutine(myX, 
+     fn = MyRFunction), MyRoutine(myX, fn = callMyCppFunction2), 
+     MyRoutineNoEx .... [TRUNCATED] 
                                     test replications elapsed relative
4                  MyRoutineNoExport(myX)          100   1.692    1.000
1      MyRoutine(myX, fn = MyCppFunction)          100   3.047    1.801
3 MyRoutine(myX, fn = callMyCppFunction2)          100   3.454    2.041
2        MyRoutine(myX, fn = MyRFunction)          100   8.277    4.892

3 Answers:

Answer 0 (score: 3)

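Use two arguments and wrap the C++ function in an R function.

C++ side: a two-argument function, as in MyCppFunction2(x, y) from the question's updated code.

R side: a one-argument wrapper, as in callMyCppFunction2(x) from the question's updated code, which calls MyCppFunction2(x, myY).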
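The same wrapping idea can also be expressed entirely in C++. The sketch below is an illustration under stated assumptions, not the answerer's code: std::vector<double> stands in for Rcpp's NumericVector so that it compiles without R, and the names weighted_sum, make_objective, and run_routine are hypothetical. A lambda captures y once and exposes the one-argument signature that a minimization routine expects.

```cpp
#include <cstddef>
#include <functional>
#include <stdexcept>
#include <vector>

// Two-argument objective, mirroring MyCppFunction2 from the question.
double weighted_sum(const std::vector<double>& x, const std::vector<double>& y) {
    if (x.size() != y.size()) {
        throw std::invalid_argument("x and y must have the same length");
    }
    double res = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) res += x[i] * y[i];
    return res;
}

// Type of a one-argument objective, as a minimizer would expect it.
using Objective = std::function<double(const std::vector<double>&)>;

// Hypothetical helper: copies y into a closure and returns a
// one-argument callable that forwards to the two-argument function.
Objective make_objective(std::vector<double> y) {
    return [y](const std::vector<double>& x) { return weighted_sum(x, y); };
}

// Hypothetical stand-in for a routine that only accepts one-argument objectives.
double run_routine(const Objective& fn, const std::vector<double>& x) {
    return fn(x);
}
```

The data is captured when make_objective is called, so later changes to the caller's copy of y do not affect the objective. Whether this fits depends on how the real minimization routine accepts its callback.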

Answer 1 (score: 2)

optim does allow additional variables to be passed. Here we minimize f over x and pass the additional variable a:

f <- function(x, a) sum((x - a)^2)
optim(1:2, f, a = 1)

giving:

$par
[1] 1.0000030 0.9999351

$value
[1] 4.22133e-09

$counts
function gradient 
      63       NA 

$convergence
[1] 0

$message
NULL
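If the minimization routine is written in C++ and under your control, the same convention can be imitated there. The following is a minimal sketch, assuming a hypothetical minimizer named minimize_gd that forwards any extra arguments to the objective, in the spirit of how optim passes a to f. It uses fixed-step gradient descent with central-difference gradients, which is not the algorithm optim uses, and std::vector<double> stands in for Rcpp's NumericVector.

```cpp
#include <cstddef>
#include <vector>

// Objective with an extra parameter a, mirroring
// f <- function(x, a) sum((x - a)^2) from the R example.
double sum_sq_dev(const std::vector<double>& x, double a) {
    double res = 0.0;
    for (double xi : x) res += (xi - a) * (xi - a);
    return res;
}

// Hypothetical minimizer: fixed-step gradient descent using
// central-difference gradients. Every extra argument is forwarded
// unchanged to fn on each evaluation.
template <typename Fn, typename... Extra>
std::vector<double> minimize_gd(Fn fn, std::vector<double> x, int iters,
                                double step, const Extra&... extra) {
    const double h = 1e-6;  // finite-difference half-width
    for (int it = 0; it < iters; ++it) {
        std::vector<double> grad(x.size());
        for (std::size_t i = 0; i < x.size(); ++i) {
            std::vector<double> lo = x, hi = x;
            lo[i] -= h;
            hi[i] += h;
            grad[i] = (fn(hi, extra...) - fn(lo, extra...)) / (2.0 * h);
        }
        for (std::size_t i = 0; i < x.size(); ++i) x[i] -= step * grad[i];
    }
    return x;
}
```

A call such as minimize_gd(sum_sq_dev, std::vector<double>{1.0, 2.0}, 60, 0.25, 1.0) plays the role of optim(1:2, f, a = 1); the step size and iteration count are arbitrary choices that happen to suit this quadratic objective.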

Answer 2 (score: 1)

Another solution: set a global variable on the C++ side:

#include <Rcpp.h>
using namespace Rcpp;

static NumericVector yglobal;

// [[Rcpp::export]]
void set_Y(NumericVector y) {
  yglobal = y;
}

// [[Rcpp::export]]
double MyCppFunction(NumericVector x) {
  double res = 0;
  for (int i = 0; i < x.size(); i++) res = res + (x(i) * yglobal(i));
  return res;
}

R side:

set.seed(123456)

myY = rnorm(1000)
set_Y(myY);
myX = rnorm(1000)

MyCppFunction(myX)

(Note: static limits the scope of the variable to this particular source file.)
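For readers who want to try the pattern outside R, here is a minimal sketch in plain C++. It assumes std::vector<double> as a stand-in for NumericVector, and the names y_global, set_y, and my_cpp_function are adapted for illustration. The length check is an addition that the answer's code does not have.

```cpp
#include <cstddef>
#include <stdexcept>
#include <utility>
#include <vector>

// File-scope data with internal linkage: other source files cannot see it.
static std::vector<double> y_global;

// Store the data once, before the optimizer starts calling the objective.
void set_y(std::vector<double> y) {
    y_global = std::move(y);
}

// One-argument objective that reads the stored data.
double my_cpp_function(const std::vector<double>& x) {
    // Added precaution: refuse to run if set_y was skipped or sizes differ.
    if (x.size() != y_global.size()) {
        throw std::invalid_argument("x and the stored y must have the same length");
    }
    double res = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) res += x[i] * y_global[i];
    return res;
}
```

The trade-off is hidden state: the result of my_cpp_function depends on the most recent call to set_y, so the setter has to be called again whenever the data changes.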