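I'm very new to programming and am using RStudio. I have two data frames: one contains an "Animal" column along with other data such as V1 (see df1), and the other is a shorter data frame that acts as a key, assigning a unique number to each animal type (see df2). Example: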
df1 <- data.frame("V1"=c(33,45,21,78,45), "Animal"=c("Dog","Dog","Horse","Cat","Dog"))
df2 <- data.frame("Key"=c(1,2,3), "Animal"=c("Dog","Cat","Horse"))
These look like this:
df1:
  V1 Animal
1 33    Dog
2 45    Dog
3 21  Horse
4 78    Cat
5 45    Dog
df2:
  Key Animal
1   1    Dog
2   2    Cat
3   3  Horse
Basically, I want to add a column to df1 that gives the number associated with each animal type, so that I end up with something like this:
df1
  V1 Animal Key
1 33    Dog   1
2 45    Dog   1
3 21  Horse   3
4 78    Cat   2
5 45    Dog   1
I tried this: df1 %>% mutate(total = ifelse(Animal == df2$Animal), df2$Key, as.character("NA"))
but I got the error message below, partly because the two data frames have different numbers of rows.
Error in ifelse(Animal == df2$Animal) :
argument "yes" is missing, with no default
In addition: Warning messages:
1: In `==.default`(Animal, df2$Animal) :
longer object length is not a multiple of shorter object length
2: In is.na(e1) | is.na(e2) :
longer object length is not a multiple of shorter object length
Any help is much appreciated, thanks!
Answer 0 (score: 0)
This works:
> df1$Key <- df2$Key[match(df1$Animal, df2$Animal)]
> df1
  V1 Animal Key
1 33    Dog   1
2 45    Dog   1
3 21  Horse   3
4 78    Cat   2
5 45    Dog   1
>
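To see why the match() approach works, it may help to look at the intermediate index vector. The sketch below reuses the question's data and adds one hypothetical animal ("Bird") that is not in the key table, purely as an assumption for illustration, to show that unmatched animals end up with NA rather than raising an error.

```r
# Same example data as in the question, plus one hypothetical animal ("Bird")
# that has no entry in the key table (added only for illustration).
df1 <- data.frame(V1 = c(33, 45, 21, 78, 45, 10),
                  Animal = c("Dog", "Dog", "Horse", "Cat", "Dog", "Bird"))
df2 <- data.frame(Key = c(1, 2, 3),
                  Animal = c("Dog", "Cat", "Horse"))

# For each value of df1$Animal, match() returns the position of that
# animal in df2$Animal, or NA when the animal is not found.
idx <- match(df1$Animal, df2$Animal)

# Indexing df2$Key with those positions looks up the key row by row;
# an NA position yields an NA key.
df1$Key <- df2$Key[idx]
print(df1)
```

Because the lookup returns one value per row of df1, the two data frames can have different numbers of rows, which is what the ifelse() attempt could not handle.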
Answer 1 (score: 0)
Here is a dplyr solution using right_join (lots more details here), if you recognize that this is a "join" or merge:
library(dplyr)
right_join(df1, df2)
#> Joining, by = "Animal"
#>   V1 Animal Key
#> 1 33    Dog   1
#> 2 45    Dog   1
#> 3 21  Horse   3
#> 4 78    Cat   2
#> 5 45    Dog   1
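As a variation on the join above, here is a minimal sketch using left_join with an explicit by argument. It assumes dplyr is installed. left_join keeps every row of df1 even when an animal has no entry in df2 (the Key is then NA), whereas right_join keeps every row of df2; for the question's data the Key values come out the same either way.

```r
library(dplyr)

df1 <- data.frame(V1 = c(33, 45, 21, 78, 45),
                  Animal = c("Dog", "Dog", "Horse", "Cat", "Dog"))
df2 <- data.frame(Key = c(1, 2, 3),
                  Animal = c("Dog", "Cat", "Horse"))

# Naming the join column explicitly avoids the "Joining, by = ..." message
# and guards against accidentally joining on other shared column names.
result <- left_join(df1, df2, by = "Animal")
print(result)
```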
Answer 2 (score: 0)
A simple base R option is to use merge:
> merge(df1,df2)
  Animal V1 Key
1    Cat 78   2
2    Dog 33   1
3    Dog 45   1
4    Dog 45   1
5  Horse 21   3
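Note that merge() sorts the result by the join column and moves that column to the front, so the rows no longer follow the order of df1. If the original order and column layout matter, one possible sketch is shown below; row_id is a hypothetical helper column introduced only for this example, and all.x = TRUE is added on the assumption that rows of df1 without a matching key should be kept.

```r
df1 <- data.frame(V1 = c(33, 45, 21, 78, 45),
                  Animal = c("Dog", "Dog", "Horse", "Cat", "Dog"))
df2 <- data.frame(Key = c(1, 2, 3),
                  Animal = c("Dog", "Cat", "Horse"))

# Record the original row order before merging (hypothetical helper column).
df1$row_id <- seq_len(nrow(df1))

# all.x = TRUE keeps every row of df1, filling Key with NA when unmatched.
merged <- merge(df1, df2, by = "Animal", all.x = TRUE)

# Restore the original order, drop the helper, and reset the row names.
merged <- merged[order(merged$row_id), c("V1", "Animal", "Key")]
rownames(merged) <- NULL
print(merged)
```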