Efficient left_join and subsequent merge

Time: 2017-07-20 14:21:58

Tags: r dplyr

I have the following data:

library(dplyr)

a <- data.frame("one" = c(1:10),
                "two" = c("", "", "", "", "a", "a", "a", "a", "a", "a"), stringsAsFactors = FALSE)

b <- data.frame("one" = c(4, 2, 6),
                "two" = c("C", "D", "A"), stringsAsFactors = FALSE)

I want to left_join b onto a so that a$two takes the value of b$two wherever a$one == b$one. I do it like this:

left_join(a, b, by="one")

To get back the same structure as before, we can do the following:

left_join(a, b, by="one") %>% 
  mutate(two=ifelse(is.na(two.y), two.x, two.y)) %>% 
  select(-c(two.x, two.y))

This gives me the desired output:

   one two
1    1    
2    2   D
3    3    
4    4   C
5    5   a
6    6   A
7    7   a
8    8   a
9    9   a
10  10   a

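Is there a way to perform the left_join so that the mutate and select steps are not needed to get the desired output? That is, is there a more efficient way to get what I want? Right now, the mutate and select steps seem cumbersome.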

1 answer:

Answer 0: (score: 1)

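If we are looking for a compact and efficient option, data.table can do this. After converting 'a' to a data.table, join on 'one' and assign (:=) 'i.two', i.e. the 'two' column from 'b', to 'two' (from 'a'). Note that setDT() converts 'a' by reference and := updates it in place, so 'a' itself is modified and no copy is made.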

library(data.table)
setDT(a)[b, two := i.two, on = .(one)]
a
#     one two
# 1:   1    
# 2:   2   D
# 3:   3    
# 4:   4   C
# 5:   5   a
# 6:   6   A
# 7:   7   a
# 8:   8   a
# 9:   9   a
#10:  10   a
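
If staying within dplyr is preferred, the mutate/select step can at least be shortened. The following is only a sketch, not a benchmarked alternative: it assumes that 'one' is unique in 'b', and the suffix ".new" and the name 'result' are chosen here purely for illustration. It keeps the name of a's column with an empty suffix and uses coalesce() to take b's value where one exists.

```r
library(dplyr)

a <- data.frame(one = 1:10,
                two = c("", "", "", "", "a", "a", "a", "a", "a", "a"),
                stringsAsFactors = FALSE)
b <- data.frame(one = c(4, 2, 6),
                two = c("C", "D", "A"),
                stringsAsFactors = FALSE)

# suffix = c("", ".new") leaves a's column named 'two' and
# renames b's column to 'two.new' (an illustrative name)
result <- left_join(a, b, by = "one", suffix = c("", ".new")) %>%
  # coalesce() returns the first non-NA value: b's value if the row matched,
  # otherwise the original value from a
  mutate(two = coalesce(two.new, two)) %>%
  select(-two.new)

result
```

Unlike the data.table version, this returns a new data frame and leaves 'a' unchanged.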