Apologies if this is a duplicate. If so, please let me know and I will happily delete it.
I am using merge() to combine two datasets in R.
age1 = c(5, 6, 7, 8, 10, 11)
fname1 = c('david','alan','ben', 'ben', 'richard', 'edd')
sname1 = c('albert','raymond','albert','pete','raymond', 'alan')
area1 = c('r','t','n','x','z','w')
df1 <- data.frame(age1, fname1, sname1, area1)
age2 = c(5, 9, 10, 3, 4, 0)
fname2 = c('david','alan','david', 'ben', 'richard', 'edd')
sname2 = c('albert','edd','albert','pete','raymond', 'alan')
area2 = c('w','z','x','n','t','r')
df2 = data.frame(age2, fname2, sname2, area2)
Dataset 1:
df1
age1 fname1 sname1 area1
1 5 david albert r
2 6 alan raymond t
3 7 ben albert n
4 8 ben pete x
5 10 richard raymond z
6 11 edd alan w
Dataset 2:
df2
age2 fname2 sname2 area2
1 5 david albert w
2 9 alan edd z
3 10 david albert x
4 3 ben pete n
5 4 richard raymond t
6 0 edd alan r
I merge on the fname and sname columns:
matchkey <- merge(df1, df2, by.x = c("fname1", "sname1"), by.y = c("fname2", "sname2"))
View(matchkey)
Output:
> matchkey
fname1 sname1 age1 area1 age2 area2
1 ben pete 8 x 3 n
2 david albert 5 r 5 w
3 david albert 5 r 10 x
4 edd alan 11 w 0 r
5 richard raymond 10 z 4 t
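However, I would like to keep the columns I merged on from both data frames. How can I do this? Should I be using something other than merge()?
Expected output:
fname1 sname1 age1 area1 fname2 sname2 age2 area2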
1 ben pete 8 x ben pete 3 n
2 david albert 5 r david albert 5 w
3 david albert 5 r david albert 10 x
4 edd alan 11 w edd alan 0 r
5 richard raymond 10 z richard raymond 4 t
I looked at the following questions, but had no luck:
How do I combine two data-frames based on two columns?
Combining two dataframes keeping all columns
Merge two dataframes with repeated columns
Many thanks.
Answer 0 (score: 1)
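Because an inner join keeps only rows whose key values match exactly, the merge columns are identical in both data frames, so you can simply copy them into new columns. transform() does this. The code below also uses outer() with paste0 to build the desired column order: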
matchkey <- transform(merge(df1, df2, by.x = c("fname1", "sname1"),
by.y = c("fname2", "sname2")),
fname2 = fname1, sname2 = sname1)
ordercols <- c(outer(c("fname", "sname", "age", "area"), c(1:2), paste0))
matchkey <- matchkey[ordercols]
matchkey
# fname1 sname1 age1 area1 fname2 sname2 age2 area2
# 1 ben pete 8 x ben pete 3 n
# 2 david albert 5 r david albert 5 w
# 3 david albert 5 r david albert 10 x
# 4 edd alan 11 w edd alan 0 r
# 5 richard raymond 10 z richard raymond 4 t
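An alternative is to copy the key columns of df2 before merging, so that merge() consumes the copies and leaves fname2 and sname2 in the result. This is only a sketch under assumptions: the helper column names key_fname and key_sname are hypothetical and not part of the original question. It may be the more convenient form if you later switch to an outer join (all = TRUE), where the key columns are no longer guaranteed to be identical on both sides.

```r
# Rebuild the example data from the question.
df1 <- data.frame(
  age1   = c(5, 6, 7, 8, 10, 11),
  fname1 = c("david", "alan", "ben", "ben", "richard", "edd"),
  sname1 = c("albert", "raymond", "albert", "pete", "raymond", "alan"),
  area1  = c("r", "t", "n", "x", "z", "w")
)
df2 <- data.frame(
  age2   = c(5, 9, 10, 3, 4, 0),
  fname2 = c("david", "alan", "david", "ben", "richard", "edd"),
  sname2 = c("albert", "edd", "albert", "pete", "raymond", "alan"),
  area2  = c("w", "z", "x", "n", "t", "r")
)

# Duplicate the df2 key columns under hypothetical helper names.
df2_keyed <- transform(df2, key_fname = fname2, key_sname = sname2)

# Merge on the copies; merge() drops key_fname/key_sname from the result,
# while the original fname2/sname2 columns survive.
matchkey2 <- merge(df1, df2_keyed,
                   by.x = c("fname1", "sname1"),
                   by.y = c("key_fname", "key_sname"))

# Same column ordering trick as above: outer() builds a 4 x 2 matrix of
# names, and c() flattens it column by column (all "...1" names first).
ordercols <- c(outer(c("fname", "sname", "age", "area"), 1:2, paste0))
matchkey2 <- matchkey2[ordercols]
matchkey2
```

For the inner join in the question, this should give the same five rows and eight columns as the transform() version.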