I have two data frames, a and b, that I want to merge:
a <- data.frame(g=c("1","2","2","3","3","3","4","4","4","4"),h=c("1","1","2","1","2","3","1","2","3","4"))
b <- data.frame(g=c("1","2","3","3","3","4","4","4","4","4"),i=c("1","2","3","2","1","2","3","4","5","6"))
g is a grouping variable; h and i are the variables I want to merge/join on.
> a
   g h
1  1 1
2  2 1
3  2 2
4  3 1
5  3 2
6  3 3
7  4 1
8  4 2
9  4 3
10 4 4
> b
   g i
1  1 1
2  2 2
3  3 3
4  3 2
5  3 1
6  4 2
7  4 3
8  4 4
9  4 5
10 4 6
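a and b should be merged on the levels of the grouping variable g. Within each level of g, identical values of h and i should be placed in the same row (regardless of the order in which they appear in h / i), and values without a match should each appear once, paired with NA (not in all possible combinations).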
The final df should look like this:
   g    h    i
1  1    1    1
2  2    1 <NA>
3  2    2    2
4  3    1    1
5  3    2    2
6  3    3    3
7  4    1 <NA>
8  4    2    2
9  4    3    3
10 4    4    4
11 4 <NA>    5
12 4 <NA>    6
I need that df to perform a correlation analysis.
Answer 0: (score: 5)
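It sounds like you want to merge on h==i while still keeping i, so create a new variable x to join on, and keep the results from both sides of the join (all=TRUE). A big hat tip to @Moody_Mudskipper.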
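A minimal sketch of the approach described above, assuming base R's merge() and transform(); it is not necessarily the answerer's exact code. The intermediate names a2, b2, and res are illustrative only.

```r
# Example data from the question
a <- data.frame(g = c("1","2","2","3","3","3","4","4","4","4"),
                h = c("1","1","2","1","2","3","1","2","3","4"))
b <- data.frame(g = c("1","2","3","3","3","4","4","4","4","4"),
                i = c("1","2","3","2","1","2","3","4","5","6"))

# Copy h and i into a shared helper column x, so both frames can be
# matched on (g, x) while the original h and i columns are kept.
a2 <- transform(a, x = h)
b2 <- transform(b, x = i)

# all = TRUE keeps unmatched rows from both sides (a full outer join);
# the missing h or i is filled with NA.
res <- merge(a2, b2, by = c("g", "x"), all = TRUE)

# Drop the helper column once the join is done.
res$x <- NULL
res
```

Because the join key is the pair (g, x), a value of h is only paired with an equal value of i inside the same group, which avoids the all-combinations blow-up that merging on g alone would produce.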
Answer 1: (score: 1)
We can also use dplyr:
library(dplyr)

# Add a helper key x to each frame, full-join on g and x, then drop x
a %>%
  mutate(x = h) %>%
  full_join(mutate(b, x = i), by = c("g", "x")) %>%
  select(-x)