Here is the data setup:
require(data.table)
set.seed(42)
pos_mat <- data.table(c1 = sample(1:1000), c2 = sample(1:1000), c3 = sample(1:1000))
data <- data.table(value = rnorm(1000), other_stuff = rnorm(1000))
The tables look like this:
> pos_mat
c1 c2 c3
1: 915 849 990
2: 937 63 439
3: 286 819 699
4: 828 538 887
5: 640 498 831
996: 118 793 783
997: 777 670 617
998: 579 195 643
999: 351 728 221
1000: 834 742 244
and
> data
value other_stuff
1: -0.6013830 0.617336710
2: -0.1358161 -0.004541141
3: -0.9872728 -0.091256360
4: 0.8319250 0.399959375
5: -0.7950595 0.588901657
996: -0.3757455 0.264323016
997: -1.0417354 -1.355822276
998: 0.6976674 0.359071548
999: -0.1444488 -1.708252839
1000: 0.4985434 -0.635928277
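Each element of pos_mat corresponds to a row number in data. I want a new data.table with the same dimensions as pos_mat that holds the corresponding values from data instead of the row numbers.
For example, pos_mat[1, .(c1)] has the value 915, and data[915, .(value)] = 0.1702369 is the value I want stored in the new object.
I have a feeling that something like this: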
new <- pos_mat
n <- nrow(pos_mat)
for(i in n) new[i,] <- data[unlist(pos_mat[1,]), value]
should work, but it keeps telling me the dimensions are wrong.
Answer 0 (score: 2)
Using a smaller data set:
require(data.table)
set.seed(42)
pos_mat <- data.table(c1 = sample(1:10), c2 = sample(1:10), c3 = sample(1:10))
data <- data.table(value = rnorm(10), other_stuff = rnorm(10))
If you want a data.table solution, you can use set to update pos_mat (or any other data set) in place, e.g.
for (j in names(pos_mat)) set(pos_mat, j = j, value = data[pos_mat[[j]], value])
pos_mat
# c1 c2 c3
# 1: 1.8951935 1.3201133 1.8951935
# 2: 1.2146747 -1.7813084 -0.2842529
# 3: -2.6564554 -0.1719174 -0.1719174
# 4: -0.3066386 -0.2842529 -1.7813084
# 5: -2.4404669 -2.6564554 0.6359504
# 6: -0.1719174 1.8951935 -2.6564554
# 7: 1.3201133 -2.4404669 1.2146747
# 8: 0.6359504 0.6359504 1.3201133
# 9: -0.2842529 -0.3066386 -0.3066386
# 10: -1.7813084 1.2146747 -2.4404669
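Note that set modifies pos_mat by reference, so the row numbers are overwritten by the looked-up values. If the original positions are still needed afterwards, one option is to work on a copy. The following is a minimal sketch under that assumption; the name `new` is chosen here for illustration and is not part of the answer above.

```r
library(data.table)
set.seed(42)
pos_mat <- data.table(c1 = sample(1:10), c2 = sample(1:10), c3 = sample(1:10))
data <- data.table(value = rnorm(10), other_stuff = rnorm(10))

# copy() makes a deep copy, so the set() calls below leave pos_mat untouched.
# `new` is an illustrative name, not taken from the answer.
new <- copy(pos_mat)

# Replace each column of the copy with the values of data at those row numbers
for (j in names(new)) set(new, j = j, value = data[pos_mat[[j]], value])

pos_mat[1]  # still holds the row numbers
new[1]      # holds the corresponding values from data
```

A plain `new <- pos_mat` would not be enough here, because both names would then refer to the same data.table and set would change both.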
Or use a matrix (starting from a fresh pos_mat data set, since the set call above overwrote it)
res <- data[unlist(pos_mat), value]
dim(res) <- dim(pos_mat)
res
# [,1] [,2] [,3]
# [1,] 1.8951935 1.3201133 1.8951935
# [2,] 1.2146747 -1.7813084 -0.2842529
# [3,] -2.6564554 -0.1719174 -0.1719174
# [4,] -0.3066386 -0.2842529 -1.7813084
# [5,] -2.4404669 -2.6564554 0.6359504
# [6,] -0.1719174 1.8951935 -2.6564554
# [7,] 1.3201133 -2.4404669 1.2146747
# [8,] 0.6359504 0.6359504 1.3201133
# [9,] -0.2842529 -0.3066386 -0.3066386
# [10,] -1.7813084 1.2146747 -2.4404669
Both should work, but the data.table approach is probably more efficient.
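As a quick sanity check, the two approaches can be compared on the same input. The sketch below is not from the answer: it uses an `lapply(.SD, ...)` one-liner as a non-destructive stand-in for the set loop (the names `new` and `idx` are illustrative) and confirms that it matches the matrix result.

```r
library(data.table)
set.seed(42)
pos_mat <- data.table(c1 = sample(1:10), c2 = sample(1:10), c3 = sample(1:10))
data <- data.table(value = rnorm(10), other_stuff = rnorm(10))

# data.table variant that builds a new table instead of overwriting pos_mat:
# each column of row numbers is mapped to the matching entries of data$value
new <- pos_mat[, lapply(.SD, function(idx) data$value[idx])]

# Matrix variant from the answer: index once, then restore the dimensions
res <- data[unlist(pos_mat), value]
dim(res) <- dim(pos_mat)

# Both hold the same values; unname() drops the column names of the matrix
stopifnot(isTRUE(all.equal(unname(as.matrix(new)), res)))
```

Which variant is faster will depend on the size of the data, so it is worth timing both on the real 1000-row tables before choosing.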