R: merge two data frames on columns while keeping the merge columns

Asked: 2017-07-19 15:11:13

Tags: r join dataframe merge concatenation

Apologies if this is a duplicate; please let me know and I will gladly delete it.

I am using merge to combine two datasets in R.

age1 = c(5, 6, 7, 8, 10, 11) 
fname1 = c('david','alan','ben', 'ben', 'richard', 'edd') 
sname1 = c('albert','raymond','albert','pete','raymond', 'alan')
area1 = c('r','t','n','x','z','w')

df1 <- data.frame(age1, fname1, sname1, area1)

age2 = c(5, 9, 10, 3, 4, 0) 
fname2 = c('david','alan','david', 'ben', 'richard', 'edd') 
sname2 = c('albert','edd','albert','pete','raymond', 'alan')
area2 = c('w','z','x','n','t','r')

df2 = data.frame(age2, fname2, sname2, area2)

Dataset 1:

df1
  age1  fname1  sname1 area1
1    5   david  albert     r
2    6    alan raymond     t
3    7     ben  albert     n
4    8     ben    pete     x
5   10 richard raymond     z
6   11     edd    alan     w

Dataset 2:

df2
  age2  fname2  sname2 area2
1    5   david  albert     w
2    9    alan     edd     z
3   10   david  albert     x
4    3     ben    pete     n
5    4 richard raymond     t
6    0     edd    alan     r

I merge on fname and sname:

matchkey <- merge(df1, df2, by.x = c("fname1", "sname1"), by.y = c("fname2", "sname2"))
View(matchkey)

Output:

> matchkey
   fname1  sname1 age1 area1 age2 area2
1     ben    pete    8     x    3     n
2   david  albert    5     r    5     w
3   david  albert    5     r   10     x
4     edd    alan   11     w    0     r
5 richard raymond   10     z    4     t

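However, I would like to keep the columns I merged on (fname2 and sname2) in the result. How can I do this? Should I be using something other than merge?

Expected output: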

   fname1  sname1 age1 area1  fname2  sname2 age2 area2
1     ben    pete    8     x     ben    pete    3     n
2   david  albert    5     r   david  albert    5     w
3   david  albert    5     r   david  albert   10     x
4     edd    alan   11     w     edd    alan    0     r
5 richard raymond   10     z richard raymond    4     t

I tried looking at these questions, without success:

How do I combine two data-frames based on two columns?

Combining two dataframes keeping all columns

Merge two dataframes with repeated columns

Many thanks.

1 Answer:

Answer 0 (score: 1)

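Because an inner join keeps only the rows whose key columns match exactly in both data frames, the dropped fname2 and sname2 columns would be identical to fname1 and sname1. So simply assign them as new columns after the merge; you can do this with transform(). Below, outer() combined with paste0 is added to build the column names in the desired order: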

matchkey <- transform(merge(df1, df2, by.x = c("fname1", "sname1"), 
                                      by.y = c("fname2", "sname2")),
                      fname2 = fname1, sname2 = sname1)

ordercols <- c(outer(c("fname", "sname", "age", "area"), c(1:2), paste0))
matchkey <- matchkey[ordercols]

matchkey    
#    fname1  sname1 age1 area1  fname2  sname2 age2 area2
# 1     ben    pete    8     x     ben    pete    3     n
# 2   david  albert    5     r   david  albert    5     w
# 3   david  albert    5     r   david  albert   10     x
# 4     edd    alan   11     w     edd    alan    0     r
# 5 richard raymond   10     z richard raymond    4     t
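
As an alternative sketch (not part of the answer above), the key columns of df2 can be copied under temporary names before merging, so that merge() consumes the copies and leaves fname2 and sname2 in the result. The names key_fname and key_sname are hypothetical and chosen only for illustration; stringsAsFactors = FALSE is also an assumption made here so that the key columns are plain character vectors.

```r
# Same sample data as in the question.
df1 <- data.frame(
  age1   = c(5, 6, 7, 8, 10, 11),
  fname1 = c("david", "alan", "ben", "ben", "richard", "edd"),
  sname1 = c("albert", "raymond", "albert", "pete", "raymond", "alan"),
  area1  = c("r", "t", "n", "x", "z", "w"),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  age2   = c(5, 9, 10, 3, 4, 0),
  fname2 = c("david", "alan", "david", "ben", "richard", "edd"),
  sname2 = c("albert", "edd", "albert", "pete", "raymond", "alan"),
  area2  = c("w", "z", "x", "n", "t", "r"),
  stringsAsFactors = FALSE
)

# Copy the df2 key columns under temporary (hypothetical) names and
# merge on the copies, so fname2 and sname2 are not dropped by merge().
df2$key_fname <- df2$fname2
df2$key_sname <- df2$sname2

merged <- merge(df1, df2,
                by.x = c("fname1", "sname1"),
                by.y = c("key_fname", "key_sname"))

# Put the columns in the order requested in the question.
merged <- merged[c("fname1", "sname1", "age1", "area1",
                   "fname2", "sname2", "age2", "area2")]
merged
```

One possible reason to prefer this variant: with all.x = TRUE, unmatched rows of df1 would show NA in fname2 and sname2, which makes it visible which rows had no partner, whereas copying fname1 into fname2 after the merge would hide that.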