I would like to merge the following data tables:
dt1 <- data.table(id = letters[1:5], day = 1, var1 = c(2,5,8,7,9), var2 = c(5,5,8,6,7), key = "id")
dt2 <- data.table(id = letters[3:7], day = 2, var1 = c(1,7,6,6,3), var2 = c(2,3,3,2,1), key = "id")
The result should contain every id for every day. Unfortunately, some ids are missing on some days.
id day var1 var2
 a   1    2    5
 a   2   NA   NA
 b   1    5    5
 b   2   NA   NA
 c   1    8    8
 c   2    1    2
 d   1    7    6
 d   2    7    3
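I tried setting id and day as the key of the DTs. With the following lines, I cannot get a row for an id on day 2 when that id is actually missing on that day, and the variables are duplicated (var1.x, var1.y):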
merge(dt1, dt2, by= c("id","day"), all=TRUE)
merge(dt1, dt2, by= c("day","id"), all=TRUE)
allow.cartesian does not help either. Does anyone have an idea or comment on how to get the final table I need?
Answer 0 (score: 1)
Try
library(data.table)
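# Stack both tables, reshape to long format, then cast back to wide.
# drop = FALSE keeps every id/day combination, filling missing ones with NA.
dcast(melt(rbind(dt1, dt2), id.vars = c("id", "day")),
      id + day ~ variable, value.var = "value", drop = FALSE)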
#     id day var1 var2
#  1:  a   1    2    5
#  2:  a   2   NA   NA
#  3:  b   1    5    5
#  4:  b   2   NA   NA
#  5:  c   1    8    8
#  6:  c   2    1    2
#  7:  d   1    7    6
#  8:  d   2    7    3
#  9:  e   1    9    7
# 10:  e   2    6    3
# 11:  f   1   NA   NA
# 12:  f   2    6    2
# 13:  g   1   NA   NA
# 14:  g   2    3    1
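To see what drop = FALSE contributes, here is a minimal sketch that casts the same long table with and without it. The names long, observed and completed are only illustrative and do not come from the question.

```r
library(data.table)

dt1 <- data.table(id = letters[1:5], day = 1, var1 = c(2,5,8,7,9), var2 = c(5,5,8,6,7), key = "id")
dt2 <- data.table(id = letters[3:7], day = 2, var1 = c(1,7,6,6,3), var2 = c(2,3,3,2,1), key = "id")

# Long format: one row per id/day/variable
long <- melt(rbind(dt1, dt2), id.vars = c("id", "day"))

# Default (drop = TRUE): only id/day pairs that occur in the data
observed <- dcast(long, id + day ~ variable, value.var = "value")

# drop = FALSE: every combination of id and day, missing cells become NA
completed <- dcast(long, id + day ~ variable, value.var = "value", drop = FALSE)

nrow(observed)   # 10 rows: 5 from dt1 plus 5 from dt2
nrow(completed)  # 14 rows: 7 ids times 2 days
```

The four extra rows in completed are exactly the id/day pairs that were absent from the input (a and b on day 2, f and g on day 1).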
Or, as @BramVisser commented, replace rbind(dt1, dt2) with rbindlist(list(dt1, dt2)).
Or, without using melt/dcast:
rbindlist(list(dt1, dt2))[, if(.N <2) .SD[c(.N, .N+1)] else .SD, id][,
day:=replace(day, is.na(day), setdiff(1:2,na.omit(day))) , id][]
Or
setkey(rbindlist(list(dt1, dt2)), id, day)[CJ(id=unique(id), day=unique(day))]
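The last one-liner can be unpacked into separate steps, which may make it easier to follow. This is a sketch, assuming a data.table version that supports the on= argument for joins. It uses on= instead of setting a key, and the names stacked, grid and result are only illustrative.

```r
library(data.table)

dt1 <- data.table(id = letters[1:5], day = 1, var1 = c(2,5,8,7,9), var2 = c(5,5,8,6,7), key = "id")
dt2 <- data.table(id = letters[3:7], day = 2, var1 = c(1,7,6,6,3), var2 = c(2,3,3,2,1), key = "id")

# 1. Stack the two tables on top of each other
stacked <- rbindlist(list(dt1, dt2))

# 2. Build the full grid of id/day combinations with a cross join
grid <- CJ(id = unique(stacked$id), day = unique(stacked$day))

# 3. Join the stacked data onto the grid; grid rows without a match get NA
result <- stacked[grid, on = c("id", "day")]

nrow(result)  # 14 rows: 7 ids times 2 days
```

Because the rows of grid drive the join, every id/day pair appears exactly once in result, and var1 and var2 are NA wherever the pair was not in either input table.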