Consider this matrix Y.
> (Y <- matrix(c(rep(1:4, each=2), rnorm(8)), 8))
     [,1]       [,2]
[1,]    1 -0.2452812
[2,]    1  1.3988440
[3,]    2  0.1558103
[4,]    2  0.2677039
[5,]    3  0.4716238
[6,]    3 -0.4442094
[7,]    4  1.9262647
[8,]    4 -0.9932708
I want to replace the values in the first column of matrix Y with the corresponding values from the second column of matrix X.
> (X <- matrix(c(1:4, 4, letters[c(2, 4, 3, 1, 1)]), 5))
     [,1] [,2]
[1,] "1"  "b"
[2,] "2"  "d"
[3,] "3"  "c"
[4,] "4"  "a"
[5,] "4"  "a"
I have this code, which technically works for both this example and my real data.
> cbind(Y, sapply(Y[, 1], function(x) unique(X[X[, 1] == x, 2])))
     [,1] [,2]                 [,3]
[1,] "1"  "-0.245281227293266" "b"
[2,] "1"  "1.39884404912828"   "b"
[3,] "2"  "0.155810319624089"  "d"
[4,] "2"  "0.267703920057734"  "d"
[5,] "3"  "0.471623773960787"  "c"
[6,] "3"  "-0.444209371984632" "c"
[7,] "4"  "1.92626472214693"   "a"
[8,] "4"  "-0.993270770582955" "a"
However, this seems to be a slow solution, because my real data is much larger: my real Y is a 260244 x 10 data frame, and the process takes more than 12 seconds.
Is there a faster, and ideally general, base R solution for recoding the values of a data frame Y using the corresponding values of a data frame X?
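
For reference, here is a minimal sketch of one vectorized alternative I am considering, using base R's `match()` instead of looping over every row with `sapply()`. It has not been timed on the real data, so the speed-up is an assumption. The `set.seed()` call and the names `idx`, `recoded`, and `res` are illustrative additions, not part of my actual code.

```r
# Example data from the question (seed added only for reproducibility)
set.seed(1)
Y <- matrix(c(rep(1:4, each = 2), rnorm(8)), 8)
X <- matrix(c(1:4, 4, letters[c(2, 4, 3, 1, 1)]), 5)

# match() gives, for each key in Y[, 1], the position of its FIRST
# occurrence in X[, 1]; indexing X[, 2] with those positions performs
# the whole lookup in one vectorized step. Keys absent from X give NA.
idx <- match(Y[, 1], X[, 1])
recoded <- X[idx, 2]

# data.frame() keeps the numeric column numeric, whereas cbind() on a
# matrix coerces everything to character
res <- data.frame(key = Y[, 1], value = Y[, 2], code = recoded)
```

One difference from the `sapply()` version: `match()` only uses the first matching row of X, so it assumes that duplicated keys in X (such as the two rows for `"4"`) always carry the same value. For data frames, `merge()` would be another base R option, though it does not preserve row order by default.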