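Suppose I want to merge two data.frames, but some of the columns are redundant (identical in both). How can I merge these data.frames and drop the redundant columns?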
X1 = data.frame(id = c("a","b","c"), same = c(1,2,3), different1 = c(4,5,6))
X2 = data.frame(id = c("b","c","a"), same = c(2,3,1), different2 = c(7,8,9))
merge(X1,X2, by="id", all = TRUE, sort = FALSE)
  id same.x different1 same.y different2
1  a      1          4      1          9
2  b      2          5      2          7
3  c      3          6      3          8
But how do I get just the different1 and different2 columns, with same appearing only once?
  id same different1 different2
1  a    1          4          9
2  b    2          5          7
3  c    3          6          8
Answer 0 (score: 5)
You can include the identical column in the by argument. The default is by = intersect(names(x), names(y)). Try merge(X1, X2) (which is the same as merge(X1, X2, by = c("id", "same"))):
merge(X1, X2)
#   id same different1 different2
# 1  a    1          4          9
# 2  b    2          5          7
# 3  c    3          6          8
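To check the equivalence claimed above, here is a minimal sketch that compares the default call with the explicit one. The variable names common, m_default, m_explicit and same_result are only illustrative; X1 and X2 are the data frames from the question.

```r
X1 <- data.frame(id = c("a", "b", "c"), same = c(1, 2, 3), different1 = c(4, 5, 6))
X2 <- data.frame(id = c("b", "c", "a"), same = c(2, 3, 1), different2 = c(7, 8, 9))

# Column names shared by both data frames; merge() uses these as keys by default
common <- intersect(names(X1), names(X2))

m_default  <- merge(X1, X2)               # relies on the default `by`
m_explicit <- merge(X1, X2, by = common)  # same keys, spelled out

# TRUE when both calls produce the same data frame
same_result <- identical(m_default, m_explicit)
print(same_result)
```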
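Answer 1 (score: 1)
Just subset by index inside the merge statement. There are many ways to subset, e.g. by name or by position. There is even a subset function, but the [] notation works in almost every case.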
merge(X1[,c("id","same","different1")], X2[,c("id","different2")], by="id", all = TRUE, sort = FALSE)
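If typing out the column names is tedious, the same subsetting can probably be done programmatically. The sketch below assumes the key is the single column id and that every other column shared by both data frames really is redundant; key, keep and result are illustrative names, not part of the original answer.

```r
X1 <- data.frame(id = c("a", "b", "c"), same = c(1, 2, 3), different1 = c(4, 5, 6))
X2 <- data.frame(id = c("b", "c", "a"), same = c(2, 3, 1), different2 = c(7, 8, 9))

key <- "id"  # assumed join key

# Keep the key plus only those X2 columns that X1 does not already have
keep <- c(key, setdiff(names(X2), names(X1)))

# drop = FALSE keeps a data frame even if only one column survives
result <- merge(X1, X2[, keep, drop = FALSE], by = key, all = TRUE, sort = FALSE)
print(result)
```

This does not verify that the dropped columns actually hold the same values in both data frames; it only compares names.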
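As the other answer shows, you can put the column in the by argument instead, but that becomes a problem once you leave the realm of one-to-one merges and move into one-to-many or many-to-many merges.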
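The answer does not spell out the problem, so the following is only one possible illustration of how the two approaches can diverge: when the supposedly redundant column is not perfectly identical, using it as a key stops rows from matching. The data frame X2_bad and the value 99 are made up for this sketch.

```r
X1 <- data.frame(id = c("a", "b", "c"), same = c(1, 2, 3), different1 = c(4, 5, 6))

# Hypothetical variant of X2 where `same` disagrees with X1 for id "a"
X2_bad <- data.frame(id = c("b", "c", "a"), same = c(2, 3, 99), different2 = c(7, 8, 9))

# Using `same` as a key: the "a" rows no longer match, so all = TRUE keeps both
by_both <- merge(X1, X2_bad, by = c("id", "same"), all = TRUE)

# Subsetting first and merging on id only: one row per id, `same` taken from X1
by_id <- merge(X1, X2_bad[, c("id", "different2")], by = "id", all = TRUE)

print(nrow(by_both))
print(nrow(by_id))
```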