Question

I extracted a data.frame from a list (that is, a list of data.frames) and want to read it into Rcpp for further manipulation. Since all of the elements are numbers, I first tried reading a column as a NumericVector. However, the order changed. I then tried reading it as a CharacterVector, which preserved the original order.

The original data.frame looks like this:

   0 1 18 19 31 Freq Prob
1  1 3 10 10  1    6 0.12
2  1 5  1  1  1    1 0.02
3 10 3 10  8 10    2 0.04
4 10 7 10  9 10    1 0.02
5 10 9 10 10 10    2 0.04
6  2 3  2  6  2    1 0.02
7  3 3  2  2  3    1 0.02

It is given by:

structure(list(`0` = structure(c(1L, 1L, 2L, 2L, 2L, 3L, 4L), .Label = c("1",
"10", "2", "3", "4", "5", "6", "7", "8", "9"), class = "factor"),
`1` = structure(c(4L, 6L, 4L, 8L, 10L, 4L, 4L), .Label = c("1",
"10", "2", "3", "4", "5", "6", "7", "8", "9"), class = "factor"),
`18` = structure(c(2L, 1L, 2L, 2L, 2L, 3L, 3L), .Label = c("1",
"10", "2", "4", "5", "6", "7", "8", "9"), class = "factor"),
`19` = structure(c(2L, 1L, 9L, 10L, 2L, 7L, 3L), .Label = c("1",
"10", "2", "3", "4", "5", "6", "7", "8", "9"), class = "factor"),
`31` = structure(c(1L, 1L, 2L, 2L, 2L, 3L, 4L), .Label = c("1",
"10", "2", "3", "4", "5", "6", "7", "8", "9"), class = "factor"),
Freq = c(6L, 1L, 2L, 1L, 2L, 1L, 1L), Prob = c(0.12, 0.02,
0.04, 0.02, 0.04, 0.02, 0.02)), .Names = c("0", "1", "18",
"19", "31", "Freq", "Prob"), row.names = c(NA, 7L), class = "data.frame")

The mode and class of each column are as follows:

> sapply(Model[[1]], mode)
        0         1        18        19        31      Freq      Prob
"numeric" "numeric" "numeric" "numeric" "numeric" "numeric" "numeric"
> sapply(Model[[1]], class)
        0         1        18        19        31      Freq      Prob
 "factor"  "factor"  "factor"  "factor"  "factor" "integer" "numeric"

Note: in each result, the first line lists the column names of the data.frame and the second line is the result of the sapply call.

The Rcpp code that reads a column as both a CharacterVector and a NumericVector is as follows:

#include <Rcpp.h>
using namespace Rcpp;

// x is the dataframe, idx is the (0-based) column to read
// [[Rcpp::export]]
int dataframe1(DataFrame& x, int idx) {
  Rcpp::CharacterVector columnChar = x[idx];
  Rcpp::NumericVector columnNum = x[idx];
  Rcpp::Rcout << columnChar << std::endl;
  Rcpp::Rcout << columnNum << std::endl;
  return (0);
}

The output is as follows, for example when the column index is 1 in R, that is, 0 in Rcpp:

dataframe1(Model[[1]],0)
"1" "1" "10" "10" "10" "2" "3" "3" "3" "4" "4" "5" "5" "5" "6" "6" "6" "6" "6" "7" "7" "7" "8" "8" "9"
1 1 2 2 2 3 4 4 4 5 5 6 6 6 7 7 7 7 7 8 8 8 9 9 10

As you can see, the order of the two vectors is different: the NumericVector appears to have been sorted. This only happens with the factor columns; the integer and numeric columns are fine.

So the question is: how can I preserve the order when reading a factor into a NumericVector in Rcpp?

Thanks

Answer 1

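Rcpp has only a limited internal representation of a factor. As a result, you have to pass in the integer values associated with each factor level ahead of time.

This is the reason for the difference: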

Rcpp::Rcout << columnChar << std::endl; // reading from factor label
Rcpp::Rcout << columnNum << std::endl; // reading from id associated with factor label

Edit

To understand what is happening, consider:

set.seed(133)
x = sample(1:10, 10, replace = F)
x

This gives:

 [1]  6  8 10  3  2  4  7  9  5  1

This is purely numeric.

Now, consider a factor:

xf = factor(x, labels = 11:20)

xf

which gives:

[1] 16 18 20 13 12 14 17 19 15 11
Levels: 11 12 13 14 15 16 17 18 19 20

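Note: the values of x are no longer present. Instead, they are masked by a mapping to character values between 11 and 20. This is why you see repeated 1s and 2s in the numeric output but 1 and 10 in the character output.

Next, if we convert to numeric, we have: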
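To make the mapping concrete, here is a minimal sketch in plain C++ (no Rcpp) of how a factor can be modelled: a table of sorted unique labels plus one 1-based code per value. The `Factor` struct and `make_factor` are hypothetical names used only for illustration, and the byte-wise string sort is an assumption; R's own ordering of character levels depends on the locale.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical stand-in for an R factor (illustration only, not Rcpp's type).
struct Factor {
    std::vector<std::string> levels;  // unique labels, sorted as strings
    std::vector<int> codes;           // 1-based position of each value's label
};

// Build a Factor from string values.
// Assumption: labels are ordered byte-wise, so "10" sorts before "2".
Factor make_factor(const std::vector<std::string>& values) {
    Factor f;
    f.levels = values;
    std::sort(f.levels.begin(), f.levels.end());
    f.levels.erase(std::unique(f.levels.begin(), f.levels.end()),
                   f.levels.end());

    f.codes.reserve(values.size());
    for (const std::string& v : values) {
        // Each value is present in levels, so lower_bound finds it exactly.
        auto it = std::lower_bound(f.levels.begin(), f.levels.end(), v);
        f.codes.push_back(static_cast<int>(it - f.levels.begin()) + 1);
    }
    return f;
}
```

Calling `make_factor({"1", "1", "10", "10", "10", "2", "3"})` yields the labels "1", "10", "2", "3" and the codes 1, 1, 2, 2, 2, 3, 4. Printing the codes therefore looks sorted even though the labels they stand for are not.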

as.numeric(xf)

which gives:

[1]  6  8 10  3  2  4  7  9  5  1

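that is, the original values before "factoring". Strictly speaking these are the factor's integer codes; they match x here only because x is a permutation of 1:10.

To obtain the actual levels: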

as.numeric(as.character(xf))

This returns:

[1] 16 18 20 13 12 14 17 19 15 11
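The same decoding step can be sketched in plain C++: look up each 1-based code in the label table and parse the label as a number. `factor_to_numeric` is a hypothetical helper, not part of Rcpp. Inside an Rcpp function, the codes would presumably come from reading the factor column as an `Rcpp::IntegerVector` and the labels from its "levels" attribute; that wiring is an assumption and is not shown here.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Decode 1-based factor codes into the numbers spelled by their labels,
// mirroring as.numeric(as.character(xf)).
// Assumptions: every label parses as a number and there are no missing
// values (NA); levels.at() throws std::out_of_range for a code outside
// 1..levels.size(), and std::stod throws for a non-numeric label.
std::vector<double> factor_to_numeric(const std::vector<int>& codes,
                                      const std::vector<std::string>& levels) {
    std::vector<double> out;
    out.reserve(codes.size());
    for (int code : codes) {
        const std::string& label = levels.at(static_cast<std::size_t>(code - 1));
        out.push_back(std::stod(label));
    }
    return out;
}
```

With the codes 6 8 10 3 2 4 7 9 5 1 and the labels "11" through "20", this returns 16 18 20 13 12 14 17 19 15 11, which keeps the original order.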

Edit 2:

To see this in action, let's modify the original function:

#include <Rcpp.h>

// [[Rcpp::export]]
void dataframe_factors(Rcpp::DataFrame& x) { 
  Rcpp::CharacterVector factor_name = x[0];
  Rcpp::NumericVector factor_id = x[0];
  Rcpp::NumericVector numeric_val = x[1];
  Rcpp::Rcout << "FN: " << factor_name << std::endl;
  Rcpp::Rcout << "FID: " << factor_id << std::endl;

  // Numeric
  Rcpp::Rcout << "ORG: " << numeric_val << std::endl;

}


/*** R
set.seed(133)
x = sample(1:10, 10, replace = F)

xf = factor(x, labels = 11:20)

d = data.frame(xf, x)

dataframe_factors(d)
*/

This gives:

FN: "16" "18" "20" "13" "12" "14" "17" "19" "15" "11"
FID: 6 8 10 3 2 4 7 9 5 1
ORG: 6 8 10 3 2 4 7 9 5 1
