I am trying to use a data.table as a lookup table:
> (dt <- data.table(myid=rep(11:12,3),zz=1:6,key=c("myid","zz")))
myid zz
1: 11 1
2: 11 3
3: 11 5
4: 12 2
5: 12 4
6: 12 6
> (id2name <- data.table(id=11:14,name=letters[1:4],key="id"))
id name
1: 11 a
2: 12 b
3: 13 c
4: 14 d
What I want is:
> (res <- data.table(myid=rep(11:12,3),zz=1:6,name=rep(letters[1:2],3),key=c("myid","zz")))
myid zz name
1: 11 1 a
2: 11 3 a
3: 11 5 a
4: 12 2 b
5: 12 4 b
6: 12 6 b
However, the join I tried fails:
> dt[id2name]
Starting binary search ...done in 0 secs
Error in vecseq(f__, len__, if (allow.cartesian) NULL else as.integer(max(nrow(x), :
Join results in 8 rows; more than 6 = max(nrow(x),nrow(i)). Check for duplicate key values in i, each of which join to the same group in x over and over again. If that's ok, try including `j` and dropping `by` (by-without-by) so that j runs for each group to avoid the large allocation. If you are sure you wish to proceed, rerun with allow.cartesian=TRUE. Otherwise, please search for this error message in the FAQ, Wiki, Stack Overflow and datatable-help for advice.
Calls: [ -> [.data.table -> vecseq
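What am I doing wrong?
PS. I am happy to get the result any other way. What is the most idiomatic way to do what I want? (dt must remain a data.table, but id2name can be anything that maps an int to something else, as long as it does not assume the int is a vector index.)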
Answer 0 (score: 5)
> dt[id2name, allow.cartesian=TRUE, nomatch=0]
myid zz name
1: 11 1 a
2: 11 3 a
3: 11 5 a
4: 12 2 b
5: 12 4 b
6: 12 6 b
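data.table is trying to save you from yourself, in case you inadvertently join on a key that has duplicate values. Note that the error message does (eventually) tell you what to do if you are sure you know what you are doing: rerun with allow.cartesian=TRUE. Here nomatch=0 additionally drops the ids in id2name (13 and 14) that have no match in dt.
Alternatively: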
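To see where the 8 rows in the error message come from, here is a minimal sketch that counts them by hand. It assumes the default nomatch=NA behaviour, where every id in id2name without a match in dt still contributes one NA row; the helper names (matched, unmatched, total_rows, joined) are only for illustration.

```r
library(data.table)

dt <- data.table(myid = rep(11:12, 3), zz = 1:6, key = c("myid", "zz"))
id2name <- data.table(id = 11:14, name = letters[1:4], key = "id")

# Rows of dt whose myid appears in id2name (ids 11 and 12, three rows each)
matched <- sum(dt$myid %in% id2name$id)

# Ids in id2name with no row in dt (13 and 14); with nomatch=NA each one
# still yields a single NA row in the join result
unmatched <- sum(!id2name$id %in% dt$myid)

# Size of the result that dt[id2name] would need to allocate
total_rows <- matched + unmatched

# Dropping the unmatched ids leaves only the six rows that were asked for
joined <- dt[id2name, nomatch = 0, allow.cartesian = TRUE]
```

So the join needs 6 + 2 rows, which exceeds the limit of 6 reported in the error, and nomatch=0 is what brings it back down to the six wanted rows.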
> id2name[dt]
id name zz
1: 11 a 1
2: 11 a 3
3: 11 a 5
4: 12 b 2
5: 12 b 4
6: 12 b 6
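If dt has to keep its own column names and key, another option is an update join that adds name to dt by reference. This is only a sketch, and it assumes a data.table version that supports the on= argument (1.9.6 or later, as far as I know), which the version used in the question may not:

```r
library(data.table)

dt <- data.table(myid = rep(11:12, 3), zz = 1:6, key = c("myid", "zz"))
id2name <- data.table(id = 11:14, name = letters[1:4], key = "id")

# Match dt$myid against id2name$id and copy the matching name into dt.
# i.name refers to the name column of id2name (the table in the i position).
# Ids in id2name without a match in dt (13 and 14) are simply ignored.
dt[id2name, name := i.name, on = c(myid = "id")]
```

Because := modifies dt in place and name is not a key column, dt stays a data.table keyed on myid and zz, and no larger intermediate result is built.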