I have two data frames:
df1 = data.frame(Sites=c("A","B","C"),total=c(12,6,35))
df2 = data.frame(Site.1=c("A","A","B"),Site.2=c("B","C","C"), Score=c(60,70,80))
I need to merge them to produce this data frame:
df3=data.frame(Site.1=c("A","A","B"),Site.2=c("B","C","C"),
Score=c(60,70,80),Site.1.total=c(12,12,6),Site.2.total=c(6,35,35))
Any suggestions for the simplest way to do this double merge? Thanks.
Answer 0 (score: 4)
Just use merge twice:
x <- merge(df2, df1, all.x=TRUE, by.x="Site.2", by.y="Sites", sort=FALSE)
merge(x, df1, all.x=TRUE, by.x="Site.1", by.y="Sites", sort=FALSE)
  Site.1 Site.2 Score total.x total.y
1      A      B    60       6      12
2      A      C    70      35      12
3      B      C    80      35       6
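The result above carries the default suffixes total.x (the Site.2 total) and total.y (the Site.1 total) rather than the Site.1.total and Site.2.total names asked for in the question. One possible way to get those names directly is to rename the lookup columns with setNames() before each merge. This is a sketch and not part of the original answer; the names lookup1 and lookup2 are made up for illustration.

```r
df1 <- data.frame(Sites = c("A", "B", "C"), total = c(12, 6, 35))
df2 <- data.frame(Site.1 = c("A", "A", "B"), Site.2 = c("B", "C", "C"),
                  Score = c(60, 70, 80))

# Hypothetical helper tables: df1 with its columns renamed so that each
# merge uses a shared key name and yields the final column name.
lookup1 <- setNames(df1, c("Site.1", "Site.1.total"))
lookup2 <- setNames(df1, c("Site.2", "Site.2.total"))

x   <- merge(df2, lookup1, by = "Site.1", all.x = TRUE, sort = FALSE)
df3 <- merge(x,   lookup2, by = "Site.2", all.x = TRUE, sort = FALSE)

# merge() moves the key column to the front, so restore the requested
# column order, then sort rows to make the result deterministic.
df3 <- df3[, c("Site.1", "Site.2", "Score", "Site.1.total", "Site.2.total")]
df3 <- df3[order(df3$Site.1, df3$Site.2), ]
print(df3)
```

Renaming up front avoids having to remember which of total.x and total.y belongs to which site.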
Answer 1 (score: 1)
Here are some sqldf solutions.
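First, let's rename the columns that have a dot in their name to remove the dot, since dot is an SQL operator. (If we did not wish to do that, we could instead refer to those columns in the SQL statements as Site_1 and Site_2, and it would understand that we are referring to Site.1 and Site.2.)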
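If df2 already exists with dotted column names, the rename can also be done programmatically instead of retyping the data frame. A minimal sketch, assuming base R's gsub() with fixed = TRUE so that the dot is treated literally and not as a regular-expression wildcard:

```r
df2 <- data.frame(Site.1 = c("A", "A", "B"), Site.2 = c("B", "C", "C"),
                  Score = c(60, 70, 80))

# Strip every literal "." from the column names: Site.1 -> Site1, etc.
names(df2) <- gsub(".", "", names(df2), fixed = TRUE)
print(names(df2))
```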
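library(sqldf)
df1 = data.frame(Sites = c("A","B","C"), total = c(12,6,35))
df2 = data.frame(Site1 = c("A","A","B"), Site2 = c("B","C","C"),
                 Score = c(60,70,80))
Now that we have the input, we can try a couple of approaches with sqldf:
sqldf with three SQL statements
# temp1 attaches the Site1 total to each row of df2; temp2 attaches the Site2 total
temp1 <- sqldf("SELECT * FROM df1 AS a, df2 AS b WHERE a.Sites = b.Site1")
temp2 <- sqldf("SELECT * FROM df1 AS a, df2 AS b WHERE a.Sites = b.Site2")
# Match the two intermediate results on the full (Site1, Site2) pair
sqldf("SELECT
         a.Site1,
         a.Site2,
         a.Score,
         a.total AS Site1Total,
         b.total AS Site2Total
       FROM temp1 AS a
       JOIN temp2 AS b USING (Site1, Site2)")
sqldf reduced to a triple join
We can further reduce the above to a triple join, which may clarify the essence of the calculation: the three SQL statements above collapse into a single statement that joins df2 to df1 twice, once on Site1 and once on Site2.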
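A single-statement version that joins df2 to df1 twice might look like the sketch below. It is a reconstruction and not necessarily the answer's original statement; the aliases a1 and a2, the LEFT JOIN form, and the ORDER BY clause are assumptions added for illustration.

```r
library(sqldf)

df1 <- data.frame(Sites = c("A", "B", "C"), total = c(12, 6, 35))
df2 <- data.frame(Site1 = c("A", "A", "B"), Site2 = c("B", "C", "C"),
                  Score = c(60, 70, 80))

# df1 appears twice under different aliases: a1 supplies the total for
# Site1 and a2 supplies the total for Site2. LEFT JOIN keeps every row
# of df2 even when a site has no entry in df1.
df3 <- sqldf("SELECT b.Site1, b.Site2, b.Score,
                     a1.total AS Site1Total,
                     a2.total AS Site2Total
              FROM df2 AS b
              LEFT JOIN df1 AS a1 ON a1.Sites = b.Site1
              LEFT JOIN df1 AS a2 ON a2.Sites = b.Site2
              ORDER BY b.Site1, b.Site2")
print(df3)
```

Because each join looks up a single row of df1 per site, no intermediate tables or grouping are needed.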