Merge data.frames but keep only unique columns?

Date: 2014-03-18 13:26:59

Tags: r

Suppose I want to merge two data.frames, but some of their columns are redundant (identical in both). How can I merge them and drop the redundant columns?

X1 = data.frame(id = c("a","b","c"), same = c(1,2,3), different1 = c(4,5,6))
X2 = data.frame(id = c("b","c","a"), same = c(2,3,1), different2 = c(7,8,9))


merge(X1,X2, by="id", all = TRUE, sort = FALSE)




  id same.x different1 same.y different2
1  a      1          4      1          9
2  b      2          5      2          7
3  c      3          6      3          8

But how can I get a single same column, with only different1 and different2 added alongside it?

  id same different1 different2
1  a    1          4          9
2  b    2          5          7
3  c    3          6          8

2 answers:

Answer 0 (score: 5)

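You can include the identical column in the by argument. The default is by = intersect(names(x), names(y)), so every column name the two data.frames share is used as a key. Try merge(X1, X2), which is the same as merge(X1, X2, by = c("id", "same")):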

 merge(X1,  X2)
 #  id same different1 different2
 #1  a    1          4          9
 #2  b    2          5          7
 #3  c    3          6          8
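
As a quick check of the claim about the default, the following sketch compares the implicit and explicit forms. It reuses X1 and X2 from the question; the names shared, m_default and m_explicit are introduced here for illustration only.

```r
# Data from the question
X1 <- data.frame(id = c("a", "b", "c"), same = c(1, 2, 3), different1 = c(4, 5, 6))
X2 <- data.frame(id = c("b", "c", "a"), same = c(2, 3, 1), different2 = c(7, 8, 9))

# Column names present in both data.frames; this is what merge() uses by default
shared <- intersect(names(X1), names(X2))

m_default  <- merge(X1, X2)               # by is left at its default
m_explicit <- merge(X1, X2, by = shared)  # by is spelled out

# Both calls should give the same four-column result
print(shared)
print(identical(m_default, m_explicit))
print(names(m_default))
```

Because same is now part of the key, a row is matched only when both id and same agree in the two data.frames.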

Answer 1 (score: 1)

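Just subset the data.frames by index inside the merge call. There are many ways to subset, for example by column name or by position. There is even a subset function, but [] notation works in almost every case.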

merge(X1[,c("id","same","different1")], X2[,c("id","different2")], by="id", all = TRUE, sort = FALSE)
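
If typing the column names by hand is impractical, the subset can be computed instead. This is only a sketch, assuming the redundant columns really are identical in both data.frames, so that keeping the copy from X1 loses nothing. The names key, extra and merged are illustrative and not part of the original answer.

```r
# Data from the question
X1 <- data.frame(id = c("a", "b", "c"), same = c(1, 2, 3), different1 = c(4, 5, 6))
X2 <- data.frame(id = c("b", "c", "a"), same = c(2, 3, 1), different2 = c(7, 8, 9))

key <- "id"

# Columns of X2 that X1 does not already have
extra <- setdiff(names(X2), names(X1))

# Keep only the key and the new columns from X2, then merge on the key alone
merged <- merge(X1, X2[, c(key, extra), drop = FALSE],
                by = key, all = TRUE, sort = FALSE)

print(merged)
```

The drop = FALSE argument keeps the subset a data.frame even if it ends up with a single column.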

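As the other answer shows, you can put the column in the by argument instead, but that becomes a problem once you leave the realm of one-to-one merges and move to one-to-many or many-to-many merges.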
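
The answer does not say exactly what goes wrong. One plausible reading, sketched below, is that a column used as a key must match exactly, so any disagreement in it silently changes which rows pair up. Y2 is a hypothetical variant of X2 in which same differs for id "c"; it does not appear in the question.

```r
X1 <- data.frame(id = c("a", "b", "c"), same = c(1, 2, 3), different1 = c(4, 5, 6))

# Hypothetical data: 'same' disagrees with X1 for id "c" (30 instead of 3)
Y2 <- data.frame(id = c("b", "c", "a"), same = c(2, 30, 1), different2 = c(7, 8, 9))

# Keyed on both id and same: the "c" rows no longer match and are dropped
inner <- merge(X1, Y2)

# Same keys with all = TRUE: id "c" now appears twice, each row half-filled with NA
outer <- merge(X1, Y2, all = TRUE)

# Keyed on id only: one row per id, with the disagreement visible in same.x / same.y
by_id <- merge(X1, Y2, by = "id")

print(nrow(inner))
print(nrow(outer))
print(by_id)
```

Merging on id alone and subsetting the columns keeps one row per id regardless of what the other columns contain.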