I am trying to merge two datasets/data.tables of different lengths, but I keep getting the following error:
Error in `[.data.table`(y, xkey, nomatch = ifelse(all.x, NA, 0), allow.cartesian = allow.cartesian) :
  retFirst must be integer vector the same length as nrow(i)
I cannot work out what the error message means. Can anyone help? I am using the following code for the merge:
merge(x=Red,y=Error,by=c("loopN","TYPE"),all.x=TRUE)
The data.table data:
DATA TABLE RED
TIME TYPE loopN diff
11/26/2014 0:45 28808 141126 0
11/26/2014 1:00 28808 141126 0
11/26/2014 1:15 28808 141126 0
11/26/2014 1:30 28808 141126 0
11/26/2014 1:15 189379 141126 0
11/26/2014 1:30 189379 141126 0
11/26/2014 2:15 189379 141126 0
11/26/2014 1:00 239188 141126 0
11/26/2014 1:15 239188 141126 0
11/26/2014 1:30 239188 141126 0
11/26/2014 13:30 239188 141126 0
DATA TABLE ERROR
loopN TYPE V1
141126 28808 -2.932
141126 28808 -2.932
141126 28808 -2.932
141126 28808 -2.932
141126 189379 1.061
141126 189379 -1.182
141126 189379 4.771
141126 239188 -0.163
141126 239188 -1.573
141126 239188 -1.981
141126 239188 -1.981

Answer 0 (score: 0)
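What you are proposing seems odd. Your data.table RED has 4 records with TYPE=28808 and loopN=141126. Likewise, your data.table ERROR has 4 records with that same combination of TYPE and loopN. Every RED row matches every ERROR row with the same key, so the merge (join) will produce 4 x 4 = 16 records for that combination alone. If your entire 1MM-record data.table looks like this, the result will be enormous. If that is really what you want, this will do it.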
RED <- structure(list(TIME = c("11/26/2014 0:45", "11/26/2014 1:00", "11/26/2014 1:15", "11/26/2014 1:30", "11/26/2014 1:15", "11/26/2014 1:30", "11/26/2014 2:15", "11/26/2014 1:00", "11/26/2014 1:15", "11/26/2014 1:30", "11/26/2014 13:30"), TYPE = c(28808L, 28808L, 28808L, 28808L, 189379L, 189379L, 189379L, 239188L, 239188L, 239188L, 239188L), loopN = c(141126L, 141126L, 141126L, 141126L, 141126L, 141126L, 141126L, 141126L, 141126L, 141126L, 141126L), diff = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L)), .Names = c("TIME", "TYPE", "loopN", "diff"), class = "data.frame", row.names = c(NA, -11L))
ERROR <- structure(list(loopN = c(141126L, 141126L, 141126L, 141126L, 141126L, 141126L, 141126L, 141126L, 141126L, 141126L, 141126L), TYPE = c(28808L, 28808L, 28808L, 28808L, 189379L, 189379L, 189379L, 239188L, 239188L, 239188L, 239188L), V1 = c(-2.932, -2.932, -2.932, -2.932, 1.061, -1.182, 4.771, -0.163, -1.573, -1.981, -1.981)), .Names = c("loopN", "TYPE", "V1"), class = "data.frame", row.names = c(NA, 11L))
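# you start here: convert both data.frames to data.tables by reference
# and key them on the join columns
library(data.table)
setkey(setDT(ERROR), loopN, TYPE)
setkey(setDT(RED), loopN, TYPE)
# for each row of RED, pull every matching row of ERROR;
# allow.cartesian=TRUE permits the large many-to-many result
result <- ERROR[RED, allow.cartesian = TRUE]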
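To see how large such a join gets before running it on the full data, one option is to count the rows per key in each table and multiply. The following is only a minimal sketch: `red` and `err` are hypothetical toy tables that reuse the key pattern of the sample data, and `cnt_red`, `cnt_err`, `sizes` and `expected_rows` are names made up for illustration. It also shows the same join written with `merge()`, passing `allow.cartesian = TRUE`.

```r
library(data.table)

# Toy stand-ins for RED and ERROR (hypothetical; same key pattern as the sample data)
red <- data.table(loopN = 141126L,
                  TYPE  = rep(c(28808L, 189379L, 239188L), times = c(4, 3, 4)),
                  diff  = 0L)
err <- data.table(loopN = 141126L,
                  TYPE  = rep(c(28808L, 189379L, 239188L), times = c(4, 3, 4)),
                  V1    = c(-2.932, -2.932, -2.932, -2.932, 1.061, -1.182, 4.771,
                            -0.163, -1.573, -1.981, -1.981))

# Number of rows per key combination in each table
cnt_red <- red[, .(n_red = .N), by = .(loopN, TYPE)]
cnt_err <- err[, .(n_err = .N), by = .(loopN, TYPE)]

# Expected size of the join: product of the group sizes, summed over shared keys
sizes <- merge(cnt_red, cnt_err, by = c("loopN", "TYPE"))
expected_rows <- sizes[, sum(n_red * n_err)]

# The same many-to-many join written with merge()
joined <- merge(red, err, by = c("loopN", "TYPE"),
                all.x = TRUE, allow.cartesian = TRUE)

# The actual row count agrees with the prediction
stopifnot(nrow(joined) == expected_rows)
```

With the sample key pattern the group sizes are 4, 3 and 4 in both tables, so the join has 16 + 9 + 16 = 41 rows from 11-row inputs. On a table with around a million rows, running the count first shows whether the full join is feasible at all.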