My data frame looks like this (df1):
options(stringsAsFactors = F)
Car <- c('Chevrolet', 'GM', 'Lexus', 'Ford', 'Mitsubishi', 'Audi')
Alternative.Cars <- c('Acura, Honda', NA, 'Nissan', 'Ferrari, Toyota, Tesla', 'Infiniti', 'Cadillac, Benz')
contact <- c('Mickey', 'Minnie', 'Daffy', 'Pluto', 'Donald', 'Goofy')
numCars <- c(0,0,0,0,0,0)
df1 <- data.frame(Car = Car, Alternative.Cars = Alternative.Cars, contact = contact, numCars = numCars)
And a second data frame (df2):
CAR <- c('Acura', 'Benz', 'Toyota', 'Nissan')
years <- c('Y2001,Y2003','Y2014', 'Y1999,Y2015,Y2016', 'Y2013')
df2 <- data.frame(CAR = CAR, years = years)
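df1 is an inventory list of cars that are currently out of stock. df2 is a list of cars with their makes and years. You will notice that df2 contains car names that appear only in the df1$Alternative.Cars column. For some reason there was a huge mix-up, and it needs to be straightened out. I would like to create a new column named 'Real.Car' that holds the car name from df2$CAR that is found in df1$Alternative.Cars, so that the result looks like the following data frame (df3):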
Car         Alternative.Cars        Real.Car  contact  numCars  years
Chevrolet   Acura, Honda            Acura     Mickey   0        Y2001,Y2003
GM          <NA>                    <NA>      Minnie   0        <NA>
Lexus       Nissan                  Nissan    Daffy    0        Y2013
Ford        Ferrari, Toyota, Tesla  Toyota    Pluto    0        Y1999,Y2015,Y2016
Mitsubishi  Infiniti                <NA>      Donald   0        <NA>
Audi        Cadillac, Benz          Benz      Goofy    0        Y2014
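I tried going through each row of df2 and looking for a match in df1. If there is a match, I want to save the years in a new column named "years". But it does not work correctly: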
for (i in 1:length(df2$CAR)) {
singleEntry <- as.data.frame(df2[1,])
df1$Car <- ifelse(grep(singleEntry, df1$Alternative.Cars),
singleEntry$years, df1$Car)
}
Any help would be greatly appreciated!
Answer 0 (score: 3)
This may not be the most efficient way, but it gets the job done.
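# Start with empty result columns; rows without a match stay NA
df1$Real.car = NA
df1$years = NA
for (i in seq_len(nrow(df2))) {
  # Rows of df1 whose Alternative.Cars text contains this car name
  match.line = grep(df2$CAR[i], df1$Alternative.Cars)
  df1$Real.car[match.line] = df2$CAR[i]
  df1$years[match.line] = df2$years[i]
}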
You can then reorder the columns to match yours.
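As a minimal sketch of that reordering step: the snippet below uses a two-row stand-in for df1 (an assumption made only so it runs on its own; the values are copied from the first two rows of this answer's output) and selects the columns by name in the order the question shows. The name df3 and the rename to 'Real.Car' follow the question's spelling and are not part of the loop above.

```r
# Stand-in for df1 after the loop has run (first two rows only, for illustration).
df1 <- data.frame(
  Car = c("Chevrolet", "GM"),
  Alternative.Cars = c("Acura, Honda", NA),
  contact = c("Mickey", "Minnie"),
  numCars = c(0, 0),
  Real.car = c("Acura", NA),
  years = c("Y2001,Y2003", NA),
  stringsAsFactors = FALSE
)

# Select the columns by name, in the order requested in the question.
df3 <- df1[, c("Car", "Alternative.Cars", "Real.car", "contact", "numCars", "years")]

# The question spells the new column 'Real.Car'; rename it to match.
names(df3)[names(df3) == "Real.car"] <- "Real.Car"
```

The printout that follows shows df1 itself, before any reordering.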
> df1
         Car       Alternative.Cars contact numCars Real.car             years
1  Chevrolet           Acura, Honda  Mickey       0    Acura       Y2001,Y2003
2         GM                   <NA>  Minnie       0     <NA>              <NA>
3      Lexus                 Nissan   Daffy       0   Nissan             Y2013
4       Ford Ferrari, Toyota, Tesla   Pluto       0   Toyota Y1999,Y2015,Y2016
5 Mitsubishi               Infiniti  Donald       0     <NA>              <NA>
6       Audi         Cadillac, Benz   Goofy       0     Benz             Y2014
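One caveat with grep() is that it matches substrings and treats the car name as a regular expression, so a name like "Benz" would also hit an entry such as "Mercedes-Benz". If exact names matter, a possible alternative (a sketch, not the method used above) is to split Alternative.Cars on commas and look each piece up with match(). The helper names alts and idx are made up for this example, and the sketch assumes that when several alternatives appear in df2 the first one listed should win.

```r
# Rebuild the inputs from the question so this snippet runs on its own.
df1 <- data.frame(
  Car = c("Chevrolet", "GM", "Lexus", "Ford", "Mitsubishi", "Audi"),
  Alternative.Cars = c("Acura, Honda", NA, "Nissan",
                       "Ferrari, Toyota, Tesla", "Infiniti", "Cadillac, Benz"),
  contact = c("Mickey", "Minnie", "Daffy", "Pluto", "Donald", "Goofy"),
  numCars = rep(0, 6),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  CAR = c("Acura", "Benz", "Toyota", "Nissan"),
  years = c("Y2001,Y2003", "Y2014", "Y1999,Y2015,Y2016", "Y2013"),
  stringsAsFactors = FALSE
)

# Split each entry into individual car names; an NA entry stays NA.
alts <- strsplit(df1$Alternative.Cars, ",\\s*")

# For each row, the df2 row index of the first alternative found in df2$CAR,
# or NA when none of the alternatives is listed there.
idx <- vapply(alts, function(a) {
  hit <- match(a, df2$CAR)
  hit <- hit[!is.na(hit)]
  if (length(hit) > 0) hit[1] else NA_integer_
}, integer(1))

# Indexing with NA yields NA, so unmatched rows are filled with NA automatically.
df1$Real.Car <- df2$CAR[idx]
df1$years <- df2$years[idx]

# Put the columns in the order shown in the question.
df3 <- df1[, c("Car", "Alternative.Cars", "Real.Car", "contact", "numCars", "years")]
```

On the sample data this should give the same Real.Car and years values as the loop; the two approaches only differ when a name is a substring of another or when one row matches more than one car in df2.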