I have the following data frame:
>> animals_df:
animal_name age
cat 1
cat 1
cat 2
cat 3
cat 3
dog 1
dog 1
dog 3
dog 4
dog 4
dog 4
horse 1
horse 3
horse 5
horse 5
horse 5
I want to keep only the animals with the highest age within each species, including ties. So I want the following output:
animal_name age
cat 3
cat 3
dog 4
dog 4
dog 4
horse 5
horse 5
horse 5
I tried using:
animals_df = do.call(rbind,lapply(split(animals_df, animals_df$animal_name), function(x) tail(x, 1) ) )
But this gives only one row per animal, as shown below:
animal_name age
cat 3
dog 4
horse 5
Answer 0 (score: 4)
With dplyr / tidyverse:
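The code for this answer is not shown here. As a minimal sketch of what a dplyr approach might look like (an illustration, not necessarily the answerer's exact code), one can group by animal_name and keep the rows whose age equals the group maximum. The data frame below is rebuilt from the question, and the name oldest is only a placeholder.

```r
library(dplyr)

# Rebuild the example data from the question
animals_df <- data.frame(
  animal_name = c(rep("cat", 5), rep("dog", 6), rep("horse", 5)),
  age = c(1, 1, 2, 3, 3, 1, 1, 3, 4, 4, 4, 1, 3, 5, 5, 5)
)

# Within each species, keep every row whose age equals that species' maximum
# (ties are kept, unlike tail(x, 1), which keeps a single row)
oldest <- animals_df %>%
  group_by(animal_name) %>%
  filter(age == max(age)) %>%
  ungroup()

print(oldest)
```

Because filter() compares each row against max(age) within its group, all tied rows survive, which is what the expected output in the question requires.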
Answer 1 (score: 1)
Another option, using data.table:
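library(data.table)
setDT(df)  # df is the data frame from the question (animals_df); converted by reference
# within each animal_name group, keep the rows whose age equals the group maximum
df[, .SD[which(age == max(age))], by = animal_name]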
# animal_name age
#1: cat 3
#2: cat 3
#3: dog 4
#4: dog 4
#5: dog 4
#6: horse 5
#7: horse 5
#8: horse 5
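For comparison, the attempt in the question returns one row per species because tail(x, 1) keeps only the last row of each group. A minimal base R sketch of the same split/lapply idea, assuming the data frame is named animals_df as in the question (the name oldest is a placeholder), replaces tail() with a comparison against the group maximum:

```r
# Rebuild the example data from the question
animals_df <- data.frame(
  animal_name = c(rep("cat", 5), rep("dog", 6), rep("horse", 5)),
  age = c(1, 1, 2, 3, 3, 1, 1, 3, 4, 4, 4, 1, 3, 5, 5, 5)
)

# Split by species, keep all rows at the group maximum, then bind back together
oldest <- do.call(
  rbind,
  lapply(split(animals_df, animals_df$animal_name),
         function(x) x[x$age == max(x$age), ])
)

# Drop the group-prefixed row names that rbind() generates
rownames(oldest) <- NULL
print(oldest)
```

This keeps the structure of the original attempt and changes only the per-group selection.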