I have two data sets:
dfgenus<- c("Coragyps" ,"Elanus", "Elanus", "Patagioenas", "Crotophaga")
which gives
dfgenus
Genus
1 Coragyps
2 Elanus
3 Elanus
4 Patagioenas
5 Crotophaga
and
family <-c("Cathartidae", "Accipitridae","Cuculidae", "Columbidae","Psittacidae")
Genus <- c("Coragyps" ,"Elanus", "Crotophaga", "Patagioenas", "Pyrrhura")
sacc <- data.frame(family, Genus)
##Sacc db rows are in the right order (the genus belongs to its taxonomic family)
sacc
family Genus
1 Cathartidae Coragyps
2 Accipitridae Elanus
3 Cuculidae Crotophaga
4 Columbidae Patagioenas
5 Psittacidae Pyrrhura
Using the information in "sacc", how can I add the correct family to each genus in "dfgenus"?
I have been trying this, without success:
for (i in length(dfgenus)){
if (identical(sacc[i], dfgenus[i])) {
df$family[i] <- sacc$family[i]
} else {
i-1
}
print(df$family)
}
The output should be:
df
family Genus
1 Cathartidae Coragyps
2 Accipitridae Elanus
3 Accipitridae Elanus
4 Columbidae Patagioenas
5 Cuculidae Crotophaga
Answer 0 (score: 1)
A solution using dplyr:
library(dplyr)
dbgenus<- data.frame(genus = c("Coragyps" ,"Elanus", "Elanus", "Patagioenas", "Crotophaga"))
family <-c("Cathartidae", "Accipitridae","Cuculidae", "Columbidae","Psittacidae")
genus <- c("Coragyps" ,"Elanus", "Crotophaga", "Patagioenas", "Pyrrhura")
sacc<- data.frame(family, genus)
# Name the join column explicitly instead of relying on the automatic choice
dbgenus %>% left_join(sacc, by = "genus")
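Answer 1 (score: 1)
There are several ways to get your result. None of them should involve a for loop :)
If you make dfgenus a data frame (with a single column), you could look into the merge() function or the join functions in the dplyr package.
But with your existing data, you can use match():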
newdf <- data.frame(Genus = dfgenus,
Family = sacc[match(dfgenus, sacc$Genus), "family"])
Genus Family
1 Coragyps Cathartidae
2 Elanus Accipitridae
3 Elanus Accipitridae
4 Patagioenas Columbidae
5 Crotophaga Cuculidae
match returns the matching row numbers from sacc, which are then used to subset the family column.
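To make the two routes concrete, here is a minimal sketch in base R. It assumes the lookup column in sacc is named Genus (as in the printed output of the question) and that strings are kept as characters rather than factors. The ord column is a hypothetical helper, not something from the question: it is added only because merge() sorts its result by the key column, so the original row order has to be restored afterwards.

```r
# Data from the question (assumption: lookup column is named Genus)
dfgenus <- c("Coragyps", "Elanus", "Elanus", "Patagioenas", "Crotophaga")
sacc <- data.frame(
  family = c("Cathartidae", "Accipitridae", "Cuculidae", "Columbidae", "Psittacidae"),
  Genus  = c("Coragyps", "Elanus", "Crotophaga", "Patagioenas", "Pyrrhura"),
  stringsAsFactors = FALSE
)

# match() route, step by step: for each genus, the row number of its
# first occurrence in sacc$Genus (NA if the genus is not listed)
idx <- match(dfgenus, sacc$Genus)
via_match <- data.frame(family = sacc$family[idx], Genus = dfgenus,
                        stringsAsFactors = FALSE)

# merge() route: wrap the vector in a one-column data frame first.
# ord is a hypothetical helper column that records the original order.
df <- data.frame(Genus = dfgenus, ord = seq_along(dfgenus),
                 stringsAsFactors = FALSE)
merged <- merge(df, sacc, by = "Genus", all.x = TRUE)  # all.x keeps unmatched genera
merged <- merged[order(merged$ord), c("family", "Genus")]
rownames(merged) <- NULL

print(via_match)
print(merged)
```

Both routes should give the same five rows here. With all.x = TRUE, a genus missing from sacc would be kept with an NA family, which mirrors what match() returns for a value it cannot find.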