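I want to do a Cartesian (full outer) join in R using the wonderful data.table package. However, I also want the rows that cannot be matched to show up in the result. My two data.tables, "left" and "right", look like this: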
key | data_left
1 | aaa
2 | bbb
3 | ccc
and
key | data_right
1 | xxx
2 | yyy
Joining them on the key column "key" gives me
key | data_left | data_right
1 | aaa | xxx
2 | bbb | yyy
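However, the unmatched row 3 | ccc is missing completely. Adding the option nomatch=0 (instead of nomatch=NA) does not help. I would like data.table to simply fill the remaining columns with NA, so I expect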
key | data_left | data_right
1 | aaa | xxx
2 | bbb | yyy
3 | ccc | NA
Any idea what I can do to make this work?
Code example:
library(data.table)
left = data.table(keyCol = c(1,2,3), data_left = c("aaa", "bbb", "ccc"))
right = data.table(keyCol = c(1,2), data_right = c("xxx", "yyy"))
setkey(left, keyCol)
setkey(right, keyCol)
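# nomatch=NA (the default): rows of `right` without a match in `left` are kept and filled with NA
resNA = left[right, allow.cartesian=TRUE, nomatch=NA]
# nomatch=0: rows of `right` without a match in `left` are dropped
res0 = left[right, allow.cartesian=TRUE, nomatch=0]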
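A short sketch of why row 3 drops out, using the same two tables: in data.table, X[Y] looks up each row of Y in X, so left[right] can contain at most the keys present in right, whatever nomatch is set to. Swapping the roles keeps every row of left. The result names `lookup_right` and `lookup_left` are made up for this illustration.

```r
library(data.table)

left  <- data.table(keyCol = c(1, 2, 3), data_left  = c("aaa", "bbb", "ccc"))
right <- data.table(keyCol = c(1, 2),    data_right = c("xxx", "yyy"))

# left[right]: one result row per row of `right`, so key 3 never appears
lookup_right <- left[right, on = "keyCol"]

# right[left]: one result row per row of `left`; key 3 gets NA in data_right
lookup_left <- right[left, on = "keyCol"]

print(lookup_right)
print(lookup_left)
```

This only covers the case where one table holds all keys. When both tables can have unmatched keys, a genuine full outer join is needed.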
Answer 0 (score: 1)
Assuming there is at most one row per keyCol value, I would do...
# setup
kc = "keyCol"
DTs = list(left, right)
# make main table with key col(s)
DT = unique(rbindlist(lapply(DTs, `[`, j = ..kc)))
# get non-key cols
for (d in DTs){
  cols = setdiff(names(d), kc)
  DT[d, on=kc, (cols) := mget(sprintf("i.%s", cols)) ][]
}
# cleanup loop vars
rm(d, cols)
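As a self-contained check, the same idea can be run on the two tables from the question. This is a minimal sketch, not a drop-in replacement: it selects the key columns with `with = FALSE` instead of `..kc`, and it adds a guard for the one-row-per-key assumption via `anyDuplicated(d, by = kc)`. Both are choices made for this illustration.

```r
library(data.table)

left  <- data.table(keyCol = c(1, 2, 3), data_left  = c("aaa", "bbb", "ccc"))
right <- data.table(keyCol = c(1, 2),    data_right = c("xxx", "yyy"))

kc  <- "keyCol"
DTs <- list(left, right)

# guard (added for this sketch): each table has at most one row per key
for (d in DTs) stopifnot(anyDuplicated(d, by = kc) == 0L)

# main table: every key value that occurs in any of the tables
DT <- unique(rbindlist(lapply(DTs, function(d) d[, kc, with = FALSE])))

# update join: copy the non-key columns of each table into DT by reference
for (d in DTs) {
  cols <- setdiff(names(d), kc)
  DT[d, on = kc, (cols) := mget(sprintf("i.%s", cols))]
}

print(DT)
```

On these inputs DT has the keys 1, 2 and 3, with data_right set to NA for key 3, which is the result the question asks for.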
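The loop above also works in the more general case of several key columns (in kc) and any number of tables (in DTs). If you want the key columns to be the key of the result, the code simplifies: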
# make main table with key col(s)
DT = setkey(unique(rbindlist(lapply(DTs, `[`, j = ..kc))))
# get non-key cols
for (d in DTs){
  cols = setdiff(names(d), kc)
  DT[d, (cols) := mget(sprintf("i.%s", cols)) ][]
}
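For comparison, a different technique from the loop above: data.table's `merge` method accepts `all = TRUE`, which performs a full outer join directly. The sketch below applies it to the tables from the question and, through `Reduce`, to a list of tables. The names `full` and `full_many` are made up for this illustration.

```r
library(data.table)

left  <- data.table(keyCol = c(1, 2, 3), data_left  = c("aaa", "bbb", "ccc"))
right <- data.table(keyCol = c(1, 2),    data_right = c("xxx", "yyy"))

# full outer join of two tables: unmatched rows from either side are kept
full <- merge(left, right, by = "keyCol", all = TRUE)

# the same join folded over a list of tables
kc  <- "keyCol"
DTs <- list(left, right)
full_many <- Reduce(function(x, y) merge(x, y, by = kc, all = TRUE), DTs)

print(full)
```

Unlike the update join in the loop, `merge` builds a new table at each step instead of adding columns by reference.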