Double merge of two data frames in R

Time: 2012-07-12 08:48:24

Tags: r merge dataframe

I have two data frames

df1 = data.frame(Sites=c("A","B","C"),total=c(12,6,35))

df2 = data.frame(Site.1=c("A","A","B"),Site.2=c("B","C","C"), Score=c(60,70,80))

I need to merge them to produce the data frame

df3=data.frame(Site.1=c("A","A","B"),Site.2=c("B","C","C"),
Score=c(60,70,80),Site.1.total=c(12,12,6),Site.2.total=c(6,35,35))

Any suggestions for the simplest way to do this double merge? Thanks

2 answers:

Answer 0 (score: 4)

Just merge twice:

x <- merge(df2, df1, all.x=TRUE, by.x="Site.2", by.y="Sites", sort=FALSE)
merge(x, df1, all.x=TRUE, by.x="Site.1", by.y="Sites", sort=FALSE)

  Site.1 Site.2 Score total.x total.y
1      A      B    60       6      12
2      A      C    70      35      12
3      B      C    80      35       6
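
Note that in the output above total.x holds the total for Site.2 (it came from the first merge) and total.y holds the total for Site.1. As a minimal sketch, assuming the goal is the exact column names and order of the df3 in the question, the two columns could be renamed and reordered afterwards. The name res is introduced here purely for illustration.

```r
df1 <- data.frame(Sites = c("A", "B", "C"), total = c(12, 6, 35))
df2 <- data.frame(Site.1 = c("A", "A", "B"), Site.2 = c("B", "C", "C"),
                  Score = c(60, 70, 80))

# Same two merges as above: first look up Site.2, then Site.1
x <- merge(df2, df1, all.x = TRUE, by.x = "Site.2", by.y = "Sites", sort = FALSE)
res <- merge(x, df1, all.x = TRUE, by.x = "Site.1", by.y = "Sites", sort = FALSE)

# total.x was carried over from the Site.2 merge, total.y from the Site.1 merge
names(res)[names(res) == "total.x"] <- "Site.2.total"
names(res)[names(res) == "total.y"] <- "Site.1.total"

# Reorder the columns to match the layout asked for in the question
res <- res[, c("Site.1", "Site.2", "Score", "Site.1.total", "Site.2.total")]
print(res)
```

Renaming after the fact keeps the two merge calls unchanged; the row order is whatever merge returns with sort=FALSE.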

Answer 1 (score: 1)

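Here are some sqldf solutions.

First, let's rename the columns that have dots in their names to remove the dots, since the dot is an SQL operator. (If we did not want to do this, we could refer to those columns in the SQL statements as Site_1 and Site_2, and sqldf would understand that we mean Site.1 and Site.2.)

library(sqldf)
df1 = data.frame(Sites = c("A","B","C"), total = c(12,6,35))
df2 = data.frame(Site1 = c("A","A","B"), Site2 = c("B","C","C"),
           Score = c(60,70,80))

Now that we have the input, we can try a couple of approaches with sqldf:

sqldf with three SQL statements

# attach the matching total to every row of df2, once via Site1 and once via Site2
temp1 <- sqldf("SELECT * FROM df1 as a, df2 as b WHERE a.Sites = b.Site1 ")
temp2 <- sqldf("SELECT * FROM df1 as a, df2 as b WHERE a.Sites = b.Site2 ")

# pair the two lookups row by row; joining on both site columns keeps each
# Score with its own (Site1, Site2) pair, so no GROUP BY is needed
sqldf("SELECT
    Site1,
    Site2,
    a.Score,
    a.total as Site1Total,
    b.total as Site2Total
FROM temp1 as a JOIN temp2 as b
USING (Site1, Site2)")

sqldf reduced to a triple join

We can further reduce the above to a triple join, which may clarify the essence of the calculation. That is, the three SQL statements above can be reduced to a single statement.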
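
As a rough sketch of what such a single-statement triple join might look like, assuming the dot-free column names Site1 and Site2 from the setup above: df1 is joined to df2 twice, once per site column. The aliases a1, a2 and b and the name res are introduced here for illustration only.

```r
library(sqldf)

df1 <- data.frame(Sites = c("A", "B", "C"), total = c(12, 6, 35))
df2 <- data.frame(Site1 = c("A", "A", "B"), Site2 = c("B", "C", "C"),
                  Score = c(60, 70, 80))

# df2 joined to two copies of df1: a1 supplies the Site1 total, a2 the Site2 total
res <- sqldf("SELECT
    b.Site1,
    b.Site2,
    b.Score,
    a1.total AS Site1Total,
    a2.total AS Site2Total
FROM df2 AS b
JOIN df1 AS a1 ON a1.Sites = b.Site1
JOIN df1 AS a2 ON a2.Sites = b.Site2")
print(res)
```

Because each row of df2 matches exactly one row in each copy of df1, the result has one row per row of df2 and needs no grouping.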