I have two data frames.
db1 looks like:
date.prix;var1;var2
2012-10-02;pluf;plof
2012-12-11;pam;pim
2013-05-17;plop;plip
...
db2 looks like:
date.de.cotation;var3;var4
2012-10-02;tutu;toto
2012-10-02;ting;tong
2013-05-17;gui;guou
...
The join condition is date.prix = date.de.cotation.
What I want is:
date.prix;var1;var2;var3;var4
2012-10-02;pluf;plof;tutu;toto
2012-12-11;pam;pim;NA;NA
2013-05-17;plop;plip;gui;guou
So: a left join on db1 that keeps only the first matching row from db2 for each date.
Answer 0 (score: 2)
We can use the duplicated and merge functions:
db2_2 <- db2[!duplicated(db2$date.de.cotation), ] # remove everything but first instance
merge(db1, db2_2, by.x = 'date.prix', by.y = 'date.de.cotation', all.x = TRUE)
#    date.prix var1 var2 var3 var4
# 1 2012-10-02 pluf plof tutu toto
# 2 2012-12-11  pam  pim <NA> <NA>
# 3 2013-05-17 plop plip  gui guou
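The snippet above assumes db1 and db2 already exist. Here is a minimal self-contained sketch that rebuilds the sample rows from the question with read.table. The `text =`, `sep = ";"` and `stringsAsFactors = FALSE` settings and the name db2_first are my choices for illustration, not something the question specifies:

```r
# Rebuild the sample data from the question (assumed to be ';'-separated text)
db1 <- read.table(text = "date.prix;var1;var2
2012-10-02;pluf;plof
2012-12-11;pam;pim
2013-05-17;plop;plip", header = TRUE, sep = ";", stringsAsFactors = FALSE)

db2 <- read.table(text = "date.de.cotation;var3;var4
2012-10-02;tutu;toto
2012-10-02;ting;tong
2013-05-17;gui;guou", header = TRUE, sep = ";", stringsAsFactors = FALSE)

# Keep only the first db2 row for each date, then left join onto db1
db2_first <- db2[!duplicated(db2$date.de.cotation), ]
out <- merge(db1, db2_first,
             by.x = "date.prix", by.y = "date.de.cotation",
             all.x = TRUE)
print(out)
```

Note that merge sorts the result by the key column by default (sort = TRUE), so the rows come back in date order rather than in the original order of db1. With this sample the two happen to coincide.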
Answer 1 (score: 2)
A left join in data.table takes a mult argument: mult = 'first' keeps only the first matching row in db2.
library(data.table)
db1 <- fread('date.prix;var1;var2
2012-10-02;pluf;plof
2012-12-11;pam;pim
2013-05-17;plop;plip')
db2 <- fread('date.de.cotation;var3;var4
2012-10-02;tutu;toto
2012-10-02;ting;tong
2013-05-17;gui;guou')
# if db1 and db2 are not data.table, do: setDT(db1); setDT(db2);
db2[db1, on = .(date.de.cotation = date.prix), mult = 'first']
#    date.de.cotation var3 var4 var1 var2
# 1:       2012-10-02 tutu toto pluf plof
# 2:       2012-12-11   NA   NA  pam  pim
# 3:       2013-05-17  gui guou plop plip
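The join result names the key column date.de.cotation and puts the db2 columns first, which differs from the layout asked for in the question. One possible way to get the requested layout, sketched with setnames and setcolorder (the name res is just for illustration):

```r
library(data.table)

# Sample data from the question
db1 <- fread('date.prix;var1;var2
2012-10-02;pluf;plof
2012-12-11;pam;pim
2013-05-17;plop;plip')
db2 <- fread('date.de.cotation;var3;var4
2012-10-02;tutu;toto
2012-10-02;ting;tong
2013-05-17;gui;guou')

# Left join onto db1, keeping only the first matching db2 row per date
res <- db2[db1, on = .(date.de.cotation = date.prix), mult = 'first']

# Rename the key column and reorder the columns, both by reference
setnames(res, 'date.de.cotation', 'date.prix')
setcolorder(res, c('date.prix', 'var1', 'var2', 'var3', 'var4'))
print(res)
```

Both setnames and setcolorder modify res in place, so no reassignment is needed.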