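Speeding up named vectors in Rcpp

Date: 2018-03-16 06:25:35

Tags: r vector attributes rcpp names

I have started using Rcpp and have been able to speed up R code incredibly. However, changing the names of a vector's elements (e.g. 'v.attr("names") = X' or 'v.names() = X') is slow in my hands. Is there a better solution? Please see the attached example.

Rcpp sample; test_names.cpp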

#include <Rcpp.h>
using namespace Rcpp;

// [[Rcpp::export]]
NumericVector test_names(int N, bool name){

  RNGScope scope;
  NumericVector data = runif(N, 1, 100);

  if(name)data.attr("names")=seq(1,N);

  return data;
}

The results I get in R:

> sourceCpp("./test_names.cpp")
> system.time(test_names(10000000, F))
   user  system elapsed 
  0.139   0.025   0.164 
> system.time(test_names(10000000, T))
   user  system elapsed 
 5.181   0.117   5.296 

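Thanks.

2 answers:

Answer 0: (score: 1)

I think creating that many strings simply takes too much time, and there is nothing you can do about it. See the following comparison: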

> N <- 1e6

> system.time(test_names(N, FALSE))
   user  system elapsed 
  0.008   0.001   0.009 

> system.time(test_names(N, TRUE))
   user  system elapsed 
  0.244   0.001   0.246 

> system.time(setNames(test_names(N, FALSE), seq_len(N)))
   user  system elapsed 
  0.236   0.001   0.238 

> system.time(seq_len(N))
   user  system elapsed 
  0.000   0.000   0.001 

> system.time(as.character(seq_len(N)))
   user  system elapsed 
  0.228   0.000   0.229 

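So it is the conversion to strings that is slow: as.character(seq_len(N)) alone takes about 0.23 s of the roughly 0.25 s spent in test_names(N, TRUE).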
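To make the cost concrete outside of R, here is a minimal sketch in plain C++ of what naming a vector with 1..n implies: one integer-to-string conversion and one string object per element. The function name make_names is hypothetical and only for illustration; it is not part of Rcpp or R.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Build the names "1", "2", ..., "n" the straightforward way:
// one integer-to-string conversion and one string object per element.
std::vector<std::string> make_names(int n) {
    assert(n >= 0);
    std::vector<std::string> names;
    names.reserve(static_cast<std::size_t>(n));
    for (int i = 1; i <= n; ++i) {
        names.push_back(std::to_string(i));
    }
    return names;
}
```

Whatever interface is used to attach the names, this per-element work has to happen somewhere, which is consistent with the timings above.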

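I usually don't use names; why do you need them?

Answer 1: (score: 0)

Naming objects from scratch takes time because, as @F.Prive points out, converting a vector to character is very expensive. In general, naming objects is considered more of a convenience than a necessity, and when "need for speed" users are faced with the choice, they happily forgo this convenience. That is why some developers make returning a named object optional.

Still, there are other ways to name objects, so let's see whether any of them speeds up the process.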

Option: names in base R

Sometimes basic functions in base R (e.g. names) can be very efficient.

// [[Rcpp::export]]
Rcpp::NumericVector test_namesBaseNameCpp(int N){
    Rcpp::RNGScope scope;
    return Rcpp::runif(N, 1, 100);
}

## in base R define the following:
test_namesBaseNameR <- function(n, named) {
    v <- test_namesBaseNameCpp(N = n)
    if (named)
        names(v) <- 1:n
    v
}

The results:

library(microbenchmark)
microbenchmark(OP = test_names(10^5, T),
               baseR = test_namesBaseNameR(10^5, T),
               unit = "relative")
Unit: relative
  expr      min       lq     mean   median       uq      max neval
    OP 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000   100
 baseR 1.008021 1.001896 1.015566 1.001912 1.016359 1.101542   100

Almost the same.

Option: passing the names as an argument

Let's see whether passing the names as an argument helps.

// [[Rcpp::export]]
Rcpp::NumericVector test_namesPassChar(int N, bool name, Rcpp::CharacterVector RNames) {
    Rcpp::RNGScope scope;
    Rcpp::NumericVector data = Rcpp::runif(N, 1, 100);
    if (name) data.attr("names") = RNames;
    return data;
}

microbenchmark(OP = test_names(10^5, T),
               passAndConvert = test_namesPassChar(10^5, T, 1:10^5),
               passAndConvert2 = test_namesPassChar(10^5, T, as.character(1:10^5)),
               unit = "relative")
Unit: relative
            expr       min        lq      mean   median        uq       max neval
              OP 0.9878606 1.0015865 0.9820272 1.000169 0.9471021 0.8645719   100
  passAndConvert 1.0000000 1.0000000 1.0000000 1.000000 1.0000000 1.0000000   100
 passAndConvert2 1.0085636 0.9998202 1.0332923 1.000702 1.0517044 1.0507859   100

No luck here either.

It seems that if we really want to speed things up, we have to define a vector of names up front and subset it as needed. For example, if you know that the maximum size of the named objects you need is maxLength, then somewhere in your project you can define myNames <- as.character(1:maxLength). Now, whenever we pass N, we also pass myNames[1:N].

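Option: pre-made names

maxLength <- 10^7
myNames <- as.character(1:maxLength)  ## this step takes a while, but you only do it once

microbenchmark(OP = test_names(10^5, T),
               passPreMade = test_namesPassChar(10^5, T, myNames[1:10^5]),
               unit = "relative")
Unit: relative
        expr      min       lq     mean   median       uq     max neval
          OP 8.149265 6.321411 6.617447 6.097933 6.042956 31.8545   100
 passPreMade 1.000000 1.000000 1.000000 1.000000 1.000000  1.0000   100

This is a nice improvement (about 6x at the median), but it is still not that satisfying because it is not general.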
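The pre-made names idea is a plain "pay once, reuse many times" cache. As a rough sketch of that pattern in standalone C++ (the class NameCache and its method first are hypothetical names for illustration, not anything provided by Rcpp), the conversion cost is paid in the constructor and later requests only copy existing strings:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper (not part of Rcpp): pays the conversion cost once,
// then hands out the first n names on demand.
class NameCache {
public:
    explicit NameCache(int max_length) {
        assert(max_length >= 0);
        names_.reserve(static_cast<std::size_t>(max_length));
        for (int i = 1; i <= max_length; ++i) {
            names_.push_back(std::to_string(i));
        }
    }

    // Analogue of myNames[1:n]: copies existing strings, no conversions.
    std::vector<std::string> first(int n) const {
        assert(n >= 0 && static_cast<std::size_t>(n) <= names_.size());
        return std::vector<std::string>(names_.begin(), names_.begin() + n);
    }

private:
    std::vector<std::string> names_;
};
```

As with the R version, this only works when an upper bound on the length is known in advance, which is exactly why the approach is not general.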

Finally, there is one option that I strongly discourage; I mention it only to show why Rcpp is so nice. We turn to C, where we exploit the structure of incrementing integers to obtain gains. SET_STRING_ELT (via Rf_mkChar) needs an array of chars, so typically we would convert each integer i to a string with temp = std::to_string(i) and then get a char array from temp with .c_str(), as in myChar = temp.c_str(). That is again as slow as the first few methods, because every integer is converted from scratch to an array of chars. The good news in this case is that we don't need to do that, since we are only naming our vector with the sequence 1:n. For 90% of the integers, only one digit changes (the ones digit). With this in mind, we can do something like the following (N.B. this can surely be improved, as my C skills are not strong):

Option I don't recommend:

#include <Rcpp.h>
#include <cmath>
#include <cstdlib>

// [[Rcpp::export]]
SEXP test_namesSuperHard(int N, bool name) {
    Rcpp::RNGScope scope;
    Rcpp::NumericVector data = Rcpp::runif(N, 1, 100);

    if (name) {
        SEXP myNames = PROTECT(Rf_allocVector(STRSXP, N));
        int base = (int) log10(N) + 1;   // number of digits in N (assumes N >= 1)
        // one extra char for the terminating '\0' that Rf_mkChar expects
        char *myChar = (char *) malloc((base + 1) * sizeof(char));
        int count = 1, index = base - 1;

        for (int i = 0; i < base; i++)
            myChar[i] = '0';
        myChar[base] = '\0';

        for (int i = 0; i < N; i++, count++) {
            if ((count % 10) == 0) {
                // the ones digit rolls over: carry into the higher digits
                while (myChar[index] == '9') {
                    myChar[index] = '0';
                    index--;
                }
                count = 0;
                myChar[index]++;
                index = base - 1;
            } else {
                // common case: only the ones digit changes
                myChar[index] = count + '0';
            }
            SET_STRING_ELT(myNames, i, Rf_mkChar(myChar));
        }
        free(myChar);
        Rf_setAttrib(data, R_NamesSymbol, myNames);
        UNPROTECT(1);
    }
    return data;
}

Test output:

test_namesSuperHard(20, TRUE)
       01        02        03        04        05 
13.417850 29.633416 35.221770 17.377710 97.139458 
       06        07        08        09        10 
24.187230 60.962993 23.307580 61.151013 12.892655 
       11        12        13        14        15 
48.303439 28.875226 37.264403 78.196955 29.705689 
       16        17        18        19        20 
 3.533349 86.505015 62.784809 95.785053 94.273097 
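The digit-rollover idea in test_namesSuperHard can be isolated from the R internals. The following is a minimal standalone C++ sketch of the same technique, under the assumption that zero-padded names of equal width are acceptable (as in the test output above). The names increment_decimal and padded_names are hypothetical, and the width is taken from std::to_string(n) rather than from log10.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Add one to a fixed-width decimal string in place, e.g. "0099" -> "0100".
// Only the trailing '9's and one more digit are touched.
void increment_decimal(std::string& digits) {
    assert(!digits.empty());
    std::size_t pos = digits.size();
    while (pos > 0 && digits[pos - 1] == '9') {
        digits[pos - 1] = '0';   // roll a trailing 9 over to 0
        --pos;
    }
    if (pos > 0) {
        ++digits[pos - 1];       // bump the first non-9 digit
    }
    // If pos == 0 the counter exceeded its width and wrapped to all zeros.
}

// Zero-padded names "01", "02", ..., up to n, each as wide as n itself.
std::vector<std::string> padded_names(int n) {
    assert(n >= 0);
    const std::size_t width = std::to_string(n).size();
    std::string current(width, '0');
    std::vector<std::string> names;
    names.reserve(static_cast<std::size_t>(n));
    for (int i = 0; i < n; ++i) {
        increment_decimal(current);
        names.push_back(current);
    }
    return names;
}
```

For nine out of ten names the loop in increment_decimal never runs and a single character is incremented, which is where the saving over a full conversion per element comes from.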

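Now we benchmark:

microbenchmark(OP = test_names(10^5, T),
               dontRecommend = test_namesSuperHard(10^5, T),
               unit = "relative")
Unit: relative
          expr      min       lq     mean   median       uq      max neval
            OP 3.753499 3.702676 3.885055 3.691977 3.504271 8.883482   100
 dontRecommend 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000   100

We see almost a 4x improvement in speed. Not bad. However, as I said at the outset, naming is a convenience, and this is a lot of work for one. It also shows the truly amazing job the Rcpp team (Dirk et al.) has done in making these conveniences effortless to implement.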