Joining multiple data frames in R

Time: 2020-09-07 20:55:10

Tags: r dplyr data.table

a1 <- data.frame(id=c(1,1,1,1,2,2,2,3,3),
                var=c("6402","1","1","3","6406","6406","2","1","1"))
b1 <- data.frame(var=c("6402","6406"),
                txt=c("A","B"))
n1 <- data.frame(id=c(1,2,3))


desired <- data.frame(id=c(1,2,3),
                      txt=c("A","B",NA))

How can I join a1, b1, and n1 to produce the desired data frame?

3 answers:

Answer 0: (score: 2)

With the following code you can get something similar to the desired data frame:

#Code
# Inner-join a1 and b1 on var, drop the var column, then left-join onto n1
m <- merge(n1, merge(a1, b1)[, -1], all.x = TRUE)
# Keep the first row for each id
m[!duplicated(m[, 'id']), ]

Output:

  id  txt
1  1    A
2  2    B
4  3 <NA>
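
The one-liner above can be hard to read, so here is a minimal step-by-step sketch of the same idea in base R. It is a variation, not the answerer's exact code: it deduplicates with unique() before the final join instead of using duplicated() afterwards, and the names matched and result are made up for illustration. It assumes each id maps to at most one txt value; if an id matched several, unique() would keep one row per distinct pair.

```r
# Data from the question
a1 <- data.frame(id = c(1, 1, 1, 1, 2, 2, 2, 3, 3),
                 var = c("6402", "1", "1", "3", "6406", "6406", "2", "1", "1"))
b1 <- data.frame(var = c("6402", "6406"),
                 txt = c("A", "B"))
n1 <- data.frame(id = c(1, 2, 3))

# Step 1: inner join on var; only rows of a1 whose var appears in b1 survive
matched <- merge(a1, b1, by = "var")

# Step 2: keep the id and txt columns and drop repeated pairs
matched <- unique(matched[c("id", "txt")])

# Step 3: left join onto n1 so that ids without a match get NA
result <- merge(n1, matched, by = "id", all.x = TRUE)
result
```

Splitting the steps also avoids computing the same merge twice.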

Answer 1: (score: 2)

Here is a base R option using nested merge:

merge(n1,
  merge(unique(a1),
    b1,
    by = "var",
    all.y = TRUE
  ),
  by = "id",
  all = TRUE
)[c("id", "txt")]

which gives

  id  txt
1  1    A
2  2    B
3  3 <NA>

Answer 2: (score: 2)

We can use the tidyverse:

library(dplyr)
distinct(a1) %>%
   left_join(b1, by = 'var') %>% 
   full_join(n1, by = 'id') %>%
   group_by(id) %>%
   summarise(txt = first(txt))
# A tibble: 3 x 2
#     id txt  
#  <dbl> <chr>
#1     1 A    
#2     2 B    
#3     3 <NA> 
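
Note that first(txt) returns whatever value happens to come first within each id, which works here because the matching row is listed first for ids 1 and 2. A hedged sketch of a variant that does not depend on row order is shown below: it starts from n1 and picks the first non-missing txt per id. This is an alternative, not the answerer's code, and the name result is illustrative.

```r
library(dplyr)

# Data from the question
a1 <- data.frame(id = c(1, 1, 1, 1, 2, 2, 2, 3, 3),
                 var = c("6402", "1", "1", "3", "6406", "6406", "2", "1", "1"))
b1 <- data.frame(var = c("6402", "6406"),
                 txt = c("A", "B"))
n1 <- data.frame(id = c(1, 2, 3))

result <- n1 %>%
  left_join(a1, by = "id") %>%    # every id in n1, with all of its var codes
  left_join(b1, by = "var") %>%   # look up txt for each code (NA if absent)
  group_by(id) %>%
  # first non-NA txt per id; indexing an empty vector with [1] yields NA
  summarise(txt = txt[!is.na(txt)][1])
result
```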

Or using data.table:

library(data.table)
# setDT() converts a1 to a data.table by reference
unique(setDT(a1))[b1,  txt := txt, on = .(var)][n1, .SD[1],
      on = .(id), by = .EACHI][, var := NULL][]
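
The chained data.table call is compact, so the following is a rough sketch of the same logic split into separate steps. It is an illustration rather than the answerer's code: the names A, B, N, and res are made up, the inputs are copied with as.data.table() so the original data frames are left unchanged, and the i. prefix is used to make explicit that txt comes from the lookup table.

```r
library(data.table)

# Data from the question, copied into data.tables
A <- as.data.table(data.frame(
  id = c(1, 1, 1, 1, 2, 2, 2, 3, 3),
  var = c("6402", "1", "1", "3", "6406", "6406", "2", "1", "1")))
B <- as.data.table(data.frame(var = c("6402", "6406"),
                              txt = c("A", "B")))
N <- as.data.table(data.frame(id = c(1, 2, 3)))

# Update join: add txt to A wherever var matches a row of B
A[B, txt := i.txt, on = "var"]

# One non-missing txt per id, then join onto N so every id is kept
res <- A[!is.na(txt), .(txt = txt[1]), by = id][N, on = "id"]
res
```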