Suppose I have two data.frames:
A<-data.frame(a=c("b","a", "a", "e", "e","a"),Za=c(11,22,33,44,55,66))
B<-data.frame(b=c("a","a", "b", "e", "f","f"),Zb=c(11,22,33,44,55,66))
Now I want to match them on columns a and b, while keeping every possible combination of matching rows. So in the end I want:
Anew<-data.frame(a=c("a","a","a","a","a","a","b","e","e","f","f"),Za=c(11,11,11,22,22,22,33,44,44,55,66))
Bnew<-data.frame(b=c("a","a","a","a","a","a","b","e","e",NA,NA),Zb=c(22,33,66,22,33,66,11,44,55,NA,NA))
Anew
a Za
1 a 11
2 a 11
3 a 11
4 a 22
5 a 22
6 a 22
7 b 33
8 e 44
9 e 44
10 f 55
11 f 66
Bnew
b Zb
1 a 22
2 a 33
3 a 66
4 a 22
5 a 33
6 a 66
7 b 11
8 e 44
9 e 55
10 <NA> NA
11 <NA> NA
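If possible, I would rather not use ncomb, because my vectors are really huge and it would kill my memory. A fast solution would be perfect!
Thank you very much for your help!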
Answer 0 (score: 1)
If you are working with large datasets, use data.table rather than data.frame. Here is one solution, a full outer join on the key columns:
library(data.table)
A<-data.table(a=c("b","a", "a", "e", "e","a"),Za=c(11,22,33,44,55,66))
B<-data.table(b=c("a","a", "b", "e", "f","f"),Zb=c(11,22,33,44,55,66))
df <- merge(A, B, by.x="a",by.y="b", all = TRUE)
df[,Match := ifelse(!is.na(Za),1,0)]
a Za Zb Match
1: a 22 11 1
2: a 22 22 1
3: a 33 11 1
4: a 33 22 1
5: a 66 11 1
6: a 66 22 1
7: b 11 33 1
8: e 44 44 1
9: e 55 44 1
10: f NA 55 0
11: f NA 66 0
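If two separate, row-aligned tables are needed (as in the question), the merged result can be split again. The following is only a minimal sketch under some assumptions: the names m, A_side and B_side are made up for illustration, and allow.cartesian = TRUE is added because many-to-many joins on large tables may otherwise be stopped by data.table's safety check. Membership of the key in each original table is tested directly, so the result does not depend on whether Za or Zb themselves contain NA.

```r
library(data.table)

A <- data.table(a = c("b", "a", "a", "e", "e", "a"), Za = c(11, 22, 33, 44, 55, 66))
B <- data.table(b = c("a", "a", "b", "e", "f", "f"), Zb = c(11, 22, 33, 44, 55, 66))

# Full outer join on A$a == B$b. allow.cartesian = TRUE permits
# many-to-many key matches whose result is much larger than the inputs.
m <- merge(A, B, by.x = "a", by.y = "b", all = TRUE, allow.cartesian = TRUE)

# Split the joined table into two tables with the same row order.
# A key that does not occur in a table becomes NA on that side.
# (A_side and B_side are hypothetical names, not from the original post.)
A_side <- m[, .(a = ifelse(a %in% A$a, a, NA_character_), Za)]
B_side <- m[, .(b = ifelse(a %in% B$b, a, NA_character_), Zb)]

# Stricter match flag: TRUE only when the key occurs in both tables.
m[, Match := a %in% A$a & a %in% B$b]
```

For the example data this gives 11 rows in each table, with the two "f" rows set to NA on the A side. Note that the join still materialises every combination within a key, so the memory needed grows with the product of the per-key row counts.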