Question

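I have just written a new version of the ABCoptim package using Rcpp. With a roughly 30x speed-up, I am very happy with the new version's performance (compared with the old one), but I still wonder whether there is room to improve performance further without modifying the code too much.

In the main function of ABCoptim (written in C++), I pass around an Rcpp::List object containing the "bee positions" (a NumericMatrix) and some NumericVectors holding important information for the algorithm itself. My question is about what happens when I pass an Rcpp::List object to other functions, e.g.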

#include <Rcpp.h>

using namespace Rcpp;

List ABCinit([some input]){[some code here]};
void ABCfun2(List x){[some code here]};
void ABCfun3(List x){[some code here]};

List ABCmain([some input])
{
  List x = ABCinit([some input]);
  while ([some statement])
  {
    ABCfun2(x);
    ABCfun3(x);
  }
  ...

  return List::create(x["results"]);
}

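What is Rcpp doing inside the while loop? Is the x object passed to ABCfun2 and ABCfun3 by reference or by deep copy? I have seen 'const List& x' used, which tells me that Rcpp objects can be passed without copying, but the thing is that I need this list to be mutable (not constant). Is there any way to improve this? I am afraid that repeatedly copying this x List could slow down my code.

PS: I am still new to C++; in fact, I am using Rcpp to learn C++.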

Answer 1

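There is no deep copy in Rcpp unless you ask for one with clone. When you pass by value, you create a new List object, but it uses the same underlying R object.

So the difference between passing by value and passing by reference is small.

However, when you pass by value, you pay the price of protecting the underlying object one more time. This may add some cost, because for this Rcpp relies on R_PreserveObject, which is recursive and not very efficient.

My guideline is to pass by reference whenever possible, so that you do not pay the extra protection price. If you know that ABCfun2 will not change the object, I would advise passing by reference to const: ABCfun2( const List& ). If you are going to make changes to the List, then I would recommend ABCfun2( List& ).

Consider the following code: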

#include <Rcpp.h>
using namespace Rcpp  ;

#define DBG(MSG,X) Rprintf("%20s SEXP=<%p>. List=%p\n", MSG, (SEXP)X, &X ) ;

void fun_copy( List x, const char* idx ){
    x[idx] = "foo" ;
    DBG( "in fun_copy: ", x) ;

}
void fun_ref( List& x, const char* idx ){
    x[idx] = "bar" ;
    DBG( "in fun_ref: ", x) ;
}


// [[Rcpp::export]]
void test_copy(){

    // create a list of 2 components
    List data = List::create( _["a"] = 1, _["b"] = 2 ) ;
    DBG( "initial: ", data) ;

    fun_copy( data, "a") ;
    DBG( "\nafter fun_copy (1): ", data) ;

    // try to add a new component "d" to the list, passed by value
    fun_copy( data, "d") ;
    DBG( "\nafter fun_copy (2): ", data) ;


}

// [[Rcpp::export]]
void test_ref(){

    // create a list of 2 components
    List data = List::create( _["a"] = 1, _["b"] = 2 ) ;
    DBG( "initial: ", data) ;

    fun_ref( data, "a") ;
    DBG( "\nafter fun_ref (1): ", data) ;

    // add a new component "d" to the list, passed by reference
    fun_ref( data, "d") ;
    DBG( "\nafter fun_ref (2): ", data) ;


}

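All I do is pass a list to a function, update it, and print some information about the pointer to the underlying R object and the pointer to the List object (this).

Here is what happens when I call test_copy and test_ref: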

> test_copy()
           initial:  SEXP=<0x7ff97c26c278>. List=0x7fff5b909fd0
       in fun_copy:  SEXP=<0x7ff97c26c278>. List=0x7fff5b909f30

after fun_copy (1):  SEXP=<0x7ff97c26c278>. List=0x7fff5b909fd0
$a
[1] "foo"

$b
[1] 2

       in fun_copy:  SEXP=<0x7ff97b2b3ed8>. List=0x7fff5b909f20

after fun_copy (2):  SEXP=<0x7ff97c26c278>. List=0x7fff5b909fd0
$a
[1] "foo"

$b
[1] 2

We start with an existing List associated with an R object.

           initial:  SEXP=<0x7fda4926d278>. List=0x7fff5bb5efd0

We pass it by value to fun_copy, so we get a new List, but one that uses the same underlying R object:

       in fun_copy:  SEXP=<0x7fda4926d278>. List=0x7fff5bb5ef30

We exit fun_copy. The underlying R object is still the same, and we are back to our original List:

after fun_copy (1):  SEXP=<0x7fda4926d278>. List=0x7fff5bb5efd0

Now we call fun_copy again, but this time we update a component that is not in the list: x["d"]="foo".

       in fun_copy:  SEXP=<0x7fda48989120>. List=0x7fff5bb5ef20

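List has no choice but to create a new underlying R object, but this object only underlies the local List. So when we exit fun_copy, we are back to the original List with its original underlying SEXP.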

after fun_copy (2):  SEXP=<0x7fda4926d278>. List=0x7fff5bb5efd0

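The key point here is that the first time, "a" was already in the list, so we updated the data in place. Because the local object in fun_copy and the outer object in test_copy share the same underlying R object, the modification made inside fun_copy was propagated.

The second time, fun_copy grows its local List object, associating it with a brand new SEXP that is not propagated to the outer function.

Now consider what happens when passing by reference: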

> test_ref()
           initial:  SEXP=<0x7ff97c0e0f80>. List=0x7fff5b909fd0
        in fun_ref:  SEXP=<0x7ff97c0e0f80>. List=0x7fff5b909fd0

  after fun_ref(1):  SEXP=<0x7ff97c0e0f80>. List=0x7fff5b909fd0
$a
[1] "bar"

$b
[1] 2

        in fun_ref:  SEXP=<0x7ff97b5254c8>. List=0x7fff5b909fd0

  after fun_ref(2):  SEXP=<0x7ff97b5254c8>. List=0x7fff5b909fd0
$a
[1] "bar"

$b
[1] 2

$d
[1] "bar"

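There is one and only one List object, 0x7fff5b909fd0. When we have to get a new SEXP in the second call, it is correctly propagated to the outer level.

To me, the behaviour you get when passing by reference is much easier to reason about.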
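The two traces can also be mimicked in plain C++ without Rcpp. The following is only a sketch of the idea, not Rcpp code: Handle, set_by_value, set_by_reference and replay_trace are hypothetical names, and a std::shared_ptr stands in for the underlying SEXP. It assumes, as the traces suggest, that assigning to an existing name modifies the shared storage, while adding a new name rebinds the handle to a freshly allocated object.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-in for an Rcpp::List: a small handle that shares its
// underlying storage, much like a List wraps a SEXP.
struct Handle {
    using Store = std::map<std::string, std::string>;
    std::shared_ptr<Store> data = std::make_shared<Store>();

    // Mimics x[name] = value: update in place when the name exists,
    // otherwise allocate new storage and rebind this handle only.
    void set(const std::string& name, const std::string& value) {
        auto it = data->find(name);
        if (it != data->end()) {
            it->second = value;                           // shared storage changes
        } else {
            auto grown = std::make_shared<Store>(*data);  // new underlying object
            (*grown)[name] = value;
            data = grown;                                 // other handles keep the old one
        }
    }

    bool has(const std::string& name) const { return data->count(name) > 0; }
    std::string get(const std::string& name) const { return data->at(name); }
};

// Copying a Handle copies the shared_ptr, not the storage it points to.
void set_by_value(Handle x, const std::string& name)      { x.set(name, "foo"); }
void set_by_reference(Handle& x, const std::string& name) { x.set(name, "bar"); }

// Replays the calls from test_copy and test_ref against the model.
void replay_trace() {
    Handle data;
    data.set("a", "1");
    data.set("b", "2");

    set_by_value(data, "a");       // existing name: visible to the caller
    assert(data.get("a") == "foo");

    set_by_value(data, "d");       // new name: only the callee's copy grows
    assert(!data.has("d"));

    set_by_reference(data, "d");   // same handle: the growth propagates
    assert(data.get("d") == "bar");
}
```

Calling replay_trace() from a main function runs the checks; the model says nothing about the protection cost mentioned earlier, only about which changes reach the caller.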

Answer 2

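In short:

void ABCfun(List x) passes by value, but then again List is an Rcpp object wrapping a SEXP, which is a pointer, so the cost here is less than a C++ programmer would suspect; in fact it is lightweight. (But as Romain rightly points out, the extra layer of protection has a cost.)

void ABCfun(const List x) promises not to change x, but again, because it is a pointer...

void ABCfun(const List & x) looks most normal to a C++ programmer, and it has been supported in Rcpp since last year.

In practice, in the Rcpp context all three are about the same. But you should think along the lines of best C++ practice and prefer the third form, as one day you may use a std::list<....> instead, in which case the const reference is clearly preferable (Scott Meyers has an entire item on this in Effective C++, or maybe in the companion More Effective C++).

But the most important lesson is that you should not just believe what people tell you on the internet, but rather measure and profile whenever possible.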
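To make the std::list remark concrete, here is a small plain C++ sketch that counts element copies for each calling style. Counted, size_by_value, size_by_const_ref, copies_made and demo are hypothetical names introduced for illustration; nothing here uses Rcpp.

```cpp
#include <cassert>
#include <cstddef>
#include <list>

// Hypothetical element type that counts how often it is copy-constructed.
struct Counted {
    static int copies;
    int value;
    explicit Counted(int v) : value(v) {}
    Counted(const Counted& other) : value(other.value) { ++copies; }
    Counted& operator=(const Counted&) = default;
};
int Counted::copies = 0;

// By value: the whole list, and so every element, is copied for the call.
std::size_t size_by_value(std::list<Counted> xs) { return xs.size(); }

// By const reference: the callee reads the caller's list directly.
std::size_t size_by_const_ref(const std::list<Counted>& xs) { return xs.size(); }

// Returns the number of element copies triggered by one call.
int copies_made(bool by_value) {
    std::list<Counted> xs;
    for (int i = 0; i < 3; ++i) xs.emplace_back(i);  // built in place, no copies
    Counted::copies = 0;
    if (by_value) size_by_value(xs); else size_by_const_ref(xs);
    return Counted::copies;
}

void demo() {
    assert(copies_made(true) == 3);   // one copy per element
    assert(copies_made(false) == 0);  // nothing is copied
}
```

Unlike a List, which is a thin wrapper around a pointer, a std::list owns its elements, so the by-value call pays for a copy of each one.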

Answer 3

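I am new to Rcpp, so I figured I would answer @Dirk's request for a measurement of the cost of the two passing styles (copy and reference)...

Surprisingly, there is very little difference between the two approaches.

I get the following: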

microbenchmark(test_copy(), test_ref(), times = 1e6)
Unit: microseconds
        expr   min    lq     mean median    uq        max neval cld
  test_copy() 5.102 5.566 7.518406  6.030 6.494 106615.653 1e+06   a
   test_ref() 4.639 5.566 7.262655  6.029 6.494   5794.319 1e+06   a

I used a simplified version of @Roman's code, with the DBG calls removed.

#include <Rcpp.h>
using namespace Rcpp;

void fun_copy( List x, const char* idx){
    x[idx] = "foo";
}

void fun_ref( List& x, const char* idx){
    x[idx] = "bar";
}

// [[Rcpp::export]]
List test_copy(){

    // create a list of 2 components
    List data = List::create( _["a"] = 1, _["b"] = 2);

    // alter the 1st component of the list, passed by value
    fun_copy( data, "a");

    // try to add a 3rd component (passed by value, so data itself does not grow)
    fun_copy( data, "d");
    return(data);

}

// [[Rcpp::export]]
List test_ref(){

    // create a list of 2 components
    List data = List::create( _["a"] = 1, _["b"] = 2);

    // alter the 1st component of the list, passed by reference
    fun_ref( data, "a");

    // add a 3rd component to the list
    fun_ref( data, "d");
    return(data);

}

/*** R

# benchmark copy v. ref functions
require(microbenchmark)
microbenchmark(test_copy(), test_ref(), times = 1e6)

*/

Within a C++ function, how are Rcpp objects passed to other functions (by reference or by copy)?
