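I haven't been able to find an answer to this. There may be one on Stack Overflow, but I haven't found one I can use.
I have two data frames (db.1 and db.larger). What I need to do is:
if db.1$ID == db.larger$ID
db.1$Gender <- db.larger$Gender
That is, wherever the IDs match, I need to copy the Gender value from db.larger into db.1.
I can't use match because several people appear more than once in db.1.
merge doesn't work for me because it adds more data (columns) to the data frame than I want.
Here is the sample data (dput output):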
db.1 <- structure(list(ID = c("453", "286", "345", "853", "675", "754","445", "564", "651", "685", "453", "286", "345"), Gender = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), Name = c("Rashad Lawrence", "Ali Santana", "Cordell Cobb", "Amani Bennett", "Donavan Frank", "Jeffrey Michael", "Aliana Trujillo", "Cheyanne Wyatt", "Kayden Padilla", "Jasmine Glass", "Rashad Lawrence", "Ali Santana", "Cordell Cobb"), Score = c(0, 0.044, 0.822, 0.322, 0.394, 0.309, 0.826, 0.729, 0.318, 0.6, 0.648, 0.547, 0.53)), .Names = c("ID", "Gender","Name", "Score"), row.names = c(NA, -13L), class = "data.frame")
and
db.larger <- structure(list(ID = c("123", "158", "286", "345", "445", "453", "469", "546", "564", "566", "651", "675", "682", "685", "741", "754", "789", "852", "853", "963"), Gender = c(1, 1, 2, 1, 1, 1, 2, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 1), Name = c("Dexter Holmes", "Roman Macias", "Ali Santana", "Cordell Cobb", "Aliana Trujillo", "Rashad Lawrence", "Preston Mckee", "Kyra Howe", "Cheyanne Wyatt", "Tobias Hart", "Kayden Padilla", "Donavan Frank", "Jamie Yoder", "Jasmine Glass", "Jamar Carter", "Jeffrey Michael", "Erick Tate", "Darion Graves", "Amani Bennett", "Regina Sanders")), .Names = c("ID", "Gender", "Name"), row.names = c(NA, 20L), class = "data.frame")
Answer 0 (score: 0)
Since db.1$Gender contains only missing values, you can drop that column and then do an inner_join from dplyr. This approach keeps the duplicate rows in db.1.
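library(dplyr)
# Drop the all-NA Gender column so the join does not produce Gender.x / Gender.y
db.1 <- db.1 %>%
  select(-Gender)
# Keep every row of db.1 (including repeated IDs) that has a match in db.larger
db.combine <- inner_join(db.1, db.larger, by = "ID")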
db.combine
    ID          Name.x Score Gender          Name.y
1  453 Rashad Lawrence 0.000      1 Rashad Lawrence
2  286     Ali Santana 0.044      2     Ali Santana
3  345    Cordell Cobb 0.822      1    Cordell Cobb
4  853   Amani Bennett 0.322      1   Amani Bennett
5  675   Donavan Frank 0.394      2   Donavan Frank
6  754 Jeffrey Michael 0.309      2 Jeffrey Michael
7  445 Aliana Trujillo 0.826      1 Aliana Trujillo
8  564  Cheyanne Wyatt 0.729      2  Cheyanne Wyatt
9  651  Kayden Padilla 0.318      2  Kayden Padilla
10 685   Jasmine Glass 0.600      2   Jasmine Glass
11 453 Rashad Lawrence 0.648      1 Rashad Lawrence
12 286     Ali Santana 0.547      2     Ali Santana
13 345    Cordell Cobb 0.530      1    Cordell Cobb
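Name appears in both data frames but is not part of the join key, so it comes back twice, as Name.x and Name.y. You can simply drop one of them with select.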
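If the duplicate Name columns are unwanted in the first place, one option is to join against only the ID and Gender columns of db.larger. The sketch below also shows a base R lookup with match, which, as far as I can tell, works even when IDs repeat in db.1, because match returns one position per row of db.1. The data frames here are trimmed-down stand-ins for the question's data (a few rows only), and by_join and by_match are names made up for illustration. Note that left_join is used instead of inner_join, so rows of db.1 without a match would be kept with Gender set to NA.

```r
library(dplyr)

# Trimmed-down stand-ins for the question's data frames (assumption: a few rows only)
db.1 <- data.frame(
  ID = c("453", "286", "345", "453"),
  Gender = NA,
  Name = c("Rashad Lawrence", "Ali Santana", "Cordell Cobb", "Rashad Lawrence"),
  Score = c(0, 0.044, 0.822, 0.648),
  stringsAsFactors = FALSE
)
db.larger <- data.frame(
  ID = c("123", "286", "345", "453"),
  Gender = c(1, 2, 1, 1),
  Name = c("Dexter Holmes", "Ali Santana", "Cordell Cobb", "Rashad Lawrence"),
  stringsAsFactors = FALSE
)

# Option 1: join only the columns that are needed, so no Name.x / Name.y appear
by_join <- db.1 %>%
  select(-Gender) %>%
  left_join(select(db.larger, ID, Gender), by = "ID")

# Option 2: base R lookup; match() gives, for each row of db.1,
# the position of its ID in db.larger (NA if the ID is absent)
by_match <- db.1
by_match$Gender <- db.larger$Gender[match(by_match$ID, db.larger$ID)]
```

Both variants leave db.1 with its original number of rows. Option 1 moves Gender to the last column, while option 2 keeps the original column order.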