How do I merge data.tables while keeping a column-pair mapping fixed?

Time: 2015-06-25 20:45:48

Tags: r merge data.table

I have two data.tables:

> a <- data.table(code=c('FI', 'NO', 'SW'), name=c('Finland', 'Norway', 'Sweden'), category=c('A', 'B', 'C'), val_1=c(1,2,3))

> a
   code    name category val_1
1:   FI Finland        A     1
2:   NO  Norway        B     2
3:   SW  Sweden        C     3

> b <- data.table(code=c('FI', 'NO', 'FI'), category=c('A', 'B', 'C'), val_2=c(4,5,6))
> b
   code category val_2
1:   FI        A     4
2:   NO        B     5
3:   FI        C     6

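If I merge these data.tables with a full outer join on code and category, I get the expected output, with NA in name for the row that exists only in b: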

> merge(a, b, all=T, by=c('code', 'category'))
   code category    name val_1 val_2
1:   FI        A Finland     1     4
2:   FI        C      NA    NA     6
3:   NO        B  Norway     2     5
4:   SW        C  Sweden     3    NA

However, the output I am looking for is

   code category    name val_1 val_2
1:   FI        A Finland     1     4
2:   FI        C Finland    NA     6
3:   NO        B  Norway     2     5
4:   SW        C  Sweden     3    NA

where the country name is taken from a, so that every row with code FI gets Finland. How can I do this?

2 answers:

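Answer 0: (score: 3)

I would simply split the code-to-name mapping out into its own table, merge without the name column, and join the mapping back on when it is needed. Note that a[,name:=NULL] removes the column from a by reference.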

codemap <- a[,name,keyby=code]
a[,name:=NULL]

m <- merge(a,b,all=TRUE,by=c('code','category'))
#    code category val_1 val_2
# 1:   FI        A     1     4
# 2:   FI        C    NA     6
# 3:   NO        B     2     5
# 4:   SW        C     3    NA

setkey(m,NULL)
codemap[m]
#    code    name category val_1 val_2
# 1:   FI Finland        A     1     4
# 2:   FI Finland        C    NA     6
# 3:   NO  Norway        B     2     5
# 4:   SW  Sweden        C     3    NA
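
The same idea can also be sketched without modifying a or clearing the key of m. This is only a variant under assumptions, not the answerer's code: it assumes a data.table version that supports the on= argument (1.9.6 or later), and it uses an update join to copy name from the mapping into the merged table.

```r
library(data.table)

a <- data.table(code = c("FI", "NO", "SW"),
                name = c("Finland", "Norway", "Sweden"),
                category = c("A", "B", "C"),
                val_1 = c(1, 2, 3))
b <- data.table(code = c("FI", "NO", "FI"),
                category = c("A", "B", "C"),
                val_2 = c(4, 5, 6))

# One row per code; a itself is left untouched
codemap <- unique(a[, .(code, name)])

# Full outer join on everything except name
m <- merge(a[, setdiff(names(a), "name"), with = FALSE], b,
           all = TRUE, by = c("code", "category"))

# Update join: look up each code in codemap and add name by reference
m[codemap, name := i.name, on = "code"]

# Optional: restore the column order shown in the question
setcolorder(m, c("code", "category", "name", "val_1", "val_2"))
print(m)
```

A code that appears in b but not in a would keep NA in name, since the mapping has no entry for it.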

Answer 1: (score: 1)

You can try filling name within each code group with the first non-NA name found in that group:

 merge(a, b, all=TRUE, by=c('code', 'category'))[,
              name := name[!is.na(name)][1L], code]
 #    code category    name val_1 val_2
 #1:   FI        A Finland     1     4
 #2:   FI        C Finland    NA     6
 #3:   NO        B  Norway     2     5
 #4:   SW        C  Sweden     3    NA
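
To see what the grouped expression does at its edges, here is a small sketch on made-up data (the code "XX" is hypothetical and not part of the question). A group with no non-NA name stays NA, because indexing an empty character vector with [1L] returns NA.

```r
library(data.table)

# Toy table: "XX" is a hypothetical code that has no name anywhere
m <- data.table(code = c("FI", "FI", "NO", "XX"),
                name = c("Finland", NA, "Norway", NA))

# Within each code, keep the first non-missing name for every row
m[, name := name[!is.na(name)][1L], by = code]
print(m)
```

If a code were mapped to several different names, this would silently pick the first one, so it relies on the code-to-name mapping being unique.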