Merge/join in R

Time: 2018-04-03 22:27:34

Tags: r merge dplyr tidyr

I have two data frames, a and b, that I want to merge:

a <- data.frame(g=c("1","2","2","3","3","3","4","4","4","4"),h=c("1","1","2","1","2","3","1","2","3","4"))

b <- data.frame(g=c("1","2","3","3","3","4","4","4","4","4"),i=c("1","2","3","2","1","2","3","4","5","6"))

g is a grouping variable; h and i are the variables I want to merge/join on.

> a
   g h
1  1 1
2  2 1
3  2 2
4  3 1
5  3 2
6  3 3
7  4 1
8  4 2
9  4 3
10 4 4

> b
   g i
1  1 1
2  2 2
3  3 3
4  3 2
5  3 1
6  4 2
7  4 3
8  4 4
9  4 5
10 4 6

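a and b should be merged within the levels of the grouping variable g. Identical values of h and i should end up in the same row (regardless of the order in which they appear in h / i), and values without a match should appear exactly once, paired with NA, rather than in every possible combination.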

The final df should look like this:

   g    h    i
1  1    1    1
2  2    1 <NA>
3  2    2    2
4  3    1    1
5  3    2    2
6  3    3    3
7  4    1 <NA>
8  4    2    2
9  4    3    3
10 4    4    4
11 4 <NA>    5
12 4 <NA>    6

I need that df to run a correlation analysis.

2 answers:

Answer 0: (score: 5)

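This sounds like a merge on h == i while still keeping i, so create a new variable x to join on, and keep the unmatched rows from both sides of the join (all=TRUE). A big hat tip to @Moody_Mudskipper.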
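A minimal sketch of the approach described above, assuming base R's merge() together with transform() to build the helper column x. The use of transform(), the explicit by argument, and the name merged are illustrative assumptions, so this is not necessarily the code that was originally posted:

```r
# Data from the question; stringsAsFactors = FALSE keeps the columns as
# plain character vectors regardless of the R version.
a <- data.frame(g = c("1","2","2","3","3","3","4","4","4","4"),
                h = c("1","1","2","1","2","3","1","2","3","4"),
                stringsAsFactors = FALSE)
b <- data.frame(g = c("1","2","3","3","3","4","4","4","4","4"),
                i = c("1","2","3","2","1","2","3","4","5","6"),
                stringsAsFactors = FALSE)

# Copy h and i into a shared helper key x, then merge on g and x.
# all = TRUE keeps rows that have no partner on the other side (full outer join).
merged <- merge(transform(a, x = h),
                transform(b, x = i),
                by = c("g", "x"),
                all = TRUE)

# The helper key is only needed for matching, so drop it afterwards.
merged$x <- NULL

print(merged)
```

Rows where h and i agree within a group share one line; a value present on only one side gets NA in the other column, which gives 12 rows for this data.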

Answer 1: (score: 1)

We can also do this with dplyr:

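library(dplyr)
a %>%
  mutate(x = h) %>%                                   # helper key copied from h
  full_join(mutate(b, x = i), by = c("g", "x")) %>%   # keep unmatched rows from both sides
  select(-x)                                          # drop the helper key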