How can I quickly recode column values based on a lookup list?

Asked: 2019-01-20 14:36:48

Tags: r recode

Consider this matrix Y:

> (Y <- matrix(c(rep(1:4, each=2), rnorm(8)), 8))
     [,1]       [,2]
[1,]    1 -0.2452812
[2,]    1  1.3988440
[3,]    2  0.1558103
[4,]    2  0.2677039
[5,]    3  0.4716238
[6,]    3 -0.4442094
[7,]    4  1.9262647
[8,]    4 -0.9932708

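I want to recode the values in the first column of matrix Y using the corresponding values in the second column of matrix X, where the first column of X holds the matching keys.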

> (X <- matrix(c(1:4, 4, letters[c(2, 4, 3, 1, 1)]), 5))
     [,1] [,2]
[1,] "1"  "b" 
[2,] "2"  "d" 
[3,] "3"  "c" 
[4,] "4"  "a" 
[5,] "4"  "a"

I have this code, which technically works both for this example and for my real data.

> cbind(Y, sapply(Y[, 1], function(x) unique(X[X[, 1] == x, 2])))
     [,1] [,2]                 [,3]
[1,] "1"  "-0.245281227293266" "b" 
[2,] "1"  "1.39884404912828"   "b" 
[3,] "2"  "0.155810319624089"  "d" 
[4,] "2"  "0.267703920057734"  "d" 
[5,] "3"  "0.471623773960787"  "c" 
[6,] "3"  "-0.444209371984632" "c" 
[7,] "4"  "1.92626472214693"   "a" 
[8,] "4"  "-0.993270770582955" "a" 

However, my real data is much larger, and this seems to be a slow solution: my real Y is a 260244 x 10 data frame, and the process takes more than 12 seconds.

Is there a faster base R solution (there usually is one) for recoding the values of data frame Y using the corresponding values from data frame X?
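One base R direction that may be worth trying is a vectorized lookup with `match()` instead of scanning X once per row of Y. The following is only a sketch under assumptions, not a benchmarked answer: the names `idx`, `code`, and `res` and the `set.seed(1)` call are introduced here for illustration, and it assumes each key in the first column of X maps to a single value (`match()` uses the first matching row, whereas the `sapply()` version would return several values for an ambiguous key).

```r
# Rebuild the example data from the question.
# set.seed() is an addition here, only to make the random column reproducible.
set.seed(1)
Y <- matrix(c(rep(1:4, each = 2), rnorm(8)), 8)
X <- matrix(c(1:4, 4, letters[c(2, 4, 3, 1, 1)]), 5)

# For every key in Y[, 1], find the position of its first match in X[, 1].
# match() coerces both sides to a common type, so numeric 1 matches "1".
idx <- match(Y[, 1], X[, 1])

# Look up the recoded value for every row of Y in one vectorized step.
code <- X[idx, 2]

# Collect the result in a data frame so the numeric columns stay numeric
# (cbind() on a matrix would coerce everything to character).
res <- data.frame(key = Y[, 1], value = Y[, 2], code = code,
                  stringsAsFactors = FALSE)
```

With data frames rather than matrices, the same idea would read `X[[2]][match(Y[[1]], X[[1]])]`. Because `match()` is vectorized, this is typically much faster than a per-element `sapply()`, though no timing on the 260244-row data is claimed here. Keys in Y that are absent from X come back as `NA`.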

0 answers:

No answers yet