I have two lists (the data frames in the lists contain more than two columns, but that is not relevant to my question):
KPI_new <- list(June=data.frame(ID=(rep("",17)), eRec= c("107349", "110878", "110024", "112188", "6187", "100420", "94436", "110165", "108508", "108773", "111859", "111907", "110704", "100413", "88995", "91644","111298") ))
KPI_old <- list(May=data.frame(ID=c(27, 30, 4, 6, 7, 20, 31, 8, 28, 25, 29, 16, 17, 18), eRec = c( "107349", "110024", "6187" , "100420", "94436", "88995" , "110165" ,"91644", "108508", "105213", "108773", "102636" ,"102339" ,"100413")),
April = data.frame(ID=c(26, 27, 2, 4, 5, 6, 7, 20, 21, 22, 8, 23, 28, 25, 29, 9, 24, 16, 17, 18), eRec=c("37866", "107349", "93051", "6187", "98274", "100420", "94436", "88995" ,"105107", "105109", "91644", "105103" ,"108508" ,"105213", "108773", "85409" ,"104145","102636" ,"102339" ,"100413")),
March = data.frame(ID= c(2, 19, 4, 5, 6, 7, 20, 21, 22, 8, 23, 25, 9, 24, 15, 16, 17, 18), eRec=c("93051" , "104499" ,"6187", "98274", "100420" ,"94436", "88995" ,"105107" ,"105109", "91644" ,"105103", "105213" ,"85409" , "104145", "100989", "102636" ,"102339", "100413")),
February = data.frame(ID= c(1 , 2, 19, 4, 5, 6, 7 ,20, 21, 22, 8, 23, 9 ,10, 24, 12, 13, 14, 15, 16, 17, 18), eRec=c("94266" , "93051", "104499" ,"6187" , "98274", "100420", "94436" ,"88995", "105107", "105109", "91644" ,"105103", "85409" ,"102252", "104145", "94559", "101426", "100992" ,"100989" ,"102636" ,"102339" ,"100413")),
January = data.frame(ID = c(1:18), eRec=c("94266" , "93051", "99836", "6187" , "98274", "100420", "94436", "91644", "85409", "102252", "94412", "94559", "101426", "100992", "100989", "102636", "102339", "100413")))
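The list KPI_old contains several data frames. The ID column is assigned based on the eRec column, so if an eRec value appears in both January and February, it has the same ID in both.
Now I want to assign IDs to the ID column (empty at this point) of the data frame in the KPI_new list, based on KPI_old.
I tried the following: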
KPI_old_df <- do.call("rbind", KPI_old)
KPI_new[[1]]$ID[(KPI_new[[1]][,2]) %in% KPI_old_df[,2]] <- unique(KPI_old_df$ID[(KPI_old_df[,2]) %in% KPI_new[[1]][,2]])
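This assigns the right values (the IDs from KPI_old for the eRec values in KPI_new that already occur in KPI_old), but it assigns some of them to the wrong rows. The order is not correct. It seems I am missing something very basic.
Thanks.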
Answer 0 (score: 0)
Try using match in the following way:
KPI_new[[1]]$ID <- KPI_old_df$ID[match(KPI_new[[1]]$eRec, KPI_old_df$eRec)]
KPI_new
#$June
# ID eRec
#1 27 107349
#2 NA 110878
#3 30 110024
#4 NA 112188
#5 4 6187
#6 6 100420
#7 7 94436
#8 31 110165
#9 28 108508
#10 29 108773
#11 NA 111859
#12 NA 111907
#13 NA 110704
#14 18 100413
#15 20 88995
#16 8 91644
#17 NA 111298
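To see why match keeps the rows aligned where the %in% attempt does not, here is a minimal sketch on toy data. The names new_keys and old and all values in them are made up for illustration and are not part of the question's data.

```r
# Toy data (hypothetical values, not taken from the question)
new_keys <- c("b", "x", "a")   # lookup keys, in the order the results are needed
old <- data.frame(ID  = c(1, 2, 1, 3),
                  key = c("a", "b", "a", "c"),
                  stringsAsFactors = FALSE)

# %in% approach: IDs come back in the row order of `old`, not of `new_keys`,
# and unique() drops repeats, so positions no longer line up with new_keys
ids_in <- unique(old$ID[old$key %in% new_keys])
# ids_in is c(1, 2): the ID of "a" comes first although "b" is first in new_keys

# match approach: one position in `old` per element of new_keys, NA if absent
ids_match <- old$ID[match(new_keys, old$key)]
# ids_match is c(2, NA, 1): aligned with new_keys, element by element
```

match returns the position of the first occurrence only, which is enough here as long as every eRec value always carries the same ID.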
Not all eRec values in KPI_new are present in KPI_old_df, so match returns NA for those rows.
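If the NA rows then need IDs of their own, one possible follow-up is sketched below. It rests on an assumption the question does not confirm: that unseen eRec values should get fresh IDs continuing after the highest existing ID, and that each unmatched eRec occurs only once. new_df and old_df are hypothetical toy stand-ins for KPI_new[[1]] and KPI_old_df.

```r
# Toy stand-ins for KPI_new[[1]] and KPI_old_df (hypothetical values)
new_df <- data.frame(eRec = c("107349", "110878", "6187", "112188"),
                     stringsAsFactors = FALSE)
old_df <- data.frame(ID   = c(27, 4, 4),
                     eRec = c("107349", "6187", "6187"),
                     stringsAsFactors = FALSE)

# Look up existing IDs; rows without a counterpart in old_df get NA
new_df$ID <- old_df$ID[match(new_df$eRec, old_df$eRec)]

# Logical flag for the rows that were not found
unmatched <- is.na(new_df$ID)

# Assumption: new eRec values receive IDs continuing after the current maximum
new_df$ID[unmatched] <- max(old_df$ID) + seq_len(sum(unmatched))
```

If the IDs follow a different scheme, the unmatched flag alone is still useful for listing the rows that need attention.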