I have a data frame covering many countries/regions, with their total cases and new cases on different dates. It looks like this:
  iso_code continent     location date       total_cases new_cases stringency_index population
  <chr>    <chr>         <chr>    <chr>            <dbl>     <dbl>            <dbl>      <dbl>
1 ABW      North America Aruba    2020-03-13           2         2                0     106766
2 ABW      North America Aruba    2020-03-19          NA        NA             33.3     106766
3 ABW      North America Aruba    2020-03-20           4         2             33.3     106766
4 ABW      North America Aruba    2020-03-21          NA        NA             44.4     106766
5 ABW      North America Aruba    2020-03-22          NA        NA             44.4     106766
6 ABW      North America Aruba    2020-03-23          NA        NA             44.4     106766
I was able to filter the data frame to get all rows where new_cases >= 5:
df_filtered <- df %>% filter(new_cases >= 5)
However, this gives me every row where new_cases is 5 or more, not just the first one for each country:
  iso_code continent     location date       total_cases new_cases stringency_index population
  <chr>    <chr>         <chr>    <chr>            <dbl>     <dbl>            <dbl>      <dbl>
1 ABW      North America Aruba    2020-03-24          12         8             44.4     106766
2 ABW      North America Aruba    2020-03-25          17         5             44.4     106766
3 ABW      North America Aruba    2020-03-27          28         9             44.4     106766
4 ABW      North America Aruba    2020-03-30          50        22             85.2     106766
5 ABW      North America Aruba    2020-04-01          55         5             85.2     106766
6 ABW      North America Aruba    2020-04-03          60         5             85.2     106766
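How can I get, for each country, only the row with the earliest date on which this condition is met?
This is what my ideal output would look like: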
  iso_code continent     location           date       total_cases new_cases stringency_index population
  <chr>    <chr>         <chr>              <chr>            <dbl>     <dbl>            <dbl>      <dbl>
1 ABW      North America Aruba              2020-03-24          12         8             44.4     106766
2 AFG      Asia          Afghanistan        2020-03-16          16         6             38.9   38928341
3 AGO      Africa        Angola             2020-04-19          24         5             90.7   32866268
4 ALB      Europe        Albania            2020-03-13          23        12             78.7    2877800
5 AND      Europe        Andorra            2020-03-17          14         9             31.4      77265
6 ARE      Asia          Utd. Arab Emirates 2020-02-28          19         6              8.3    9890400
Answer 0 (score: 1)
Try this:
library(dplyr)

df %>%
  group_by(iso_code) %>%                         ## within each country (group)
  filter(new_cases >= 5) %>%                     ## keep rows where there are at least 5 new cases
  slice_min(date, n = 1, with_ties = FALSE) %>%  ## then keep the row with the smallest date
  ungroup()                                      ## drop the grouping from the result
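To check the approach on something small, here is a minimal sketch that runs end to end. The toy data frame and the name `first_hit` are made up for illustration and only loosely mirror the rows in the question; it assumes dplyr 1.0.0 or later, where `slice_min()` is available.

```r
library(dplyr)

# Toy data (hypothetical, trimmed to the columns that matter here)
df <- data.frame(
  iso_code  = c("ABW", "ABW", "ABW", "AFG", "AFG", "AFG"),
  date      = c("2020-03-20", "2020-03-24", "2020-03-25",
                "2020-03-10", "2020-03-16", "2020-03-22"),
  new_cases = c(2, 8, 5, NA, 6, 10),
  stringsAsFactors = FALSE
)

first_hit <- df %>%
  group_by(iso_code) %>%                         # one group per country
  filter(new_cases >= 5) %>%                     # rows with NA in new_cases are dropped too
  slice_min(date, n = 1, with_ties = FALSE) %>%  # earliest qualifying date per group
  ungroup()

print(first_hit)
```

This should leave one row per country: 2020-03-24 for ABW and 2020-03-16 for AFG. Note that `date` is a character column here, as in the question; the comparison still picks the earliest day because "YYYY-MM-DD" strings sort in chronological order.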
Answer 1 (score: 0)
I got it working with the following code:
library(dplyr)

df_filtered <- df %>% filter(new_cases >= 5)  # keep rows with at least 5 new cases
df_sorted <- df_filtered %>%
  group_by(iso_code) %>%  # group by country
  arrange(date) %>%       # sort by date (arrange() ignores groups by default)
  slice(1L)               # first, i.e. earliest, row of each group
Inspired by the answers to this question: Earliest Date for each id in R
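One caveat that applies to both answers: the printout in the question shows `date` as `<chr>`. Sorting strings only gives chronological order when they are in "YYYY-MM-DD" form, so it may be safer to parse them first. A minimal sketch, assuming the dates really are ISO formatted; the toy data and the name `first_dates` are hypothetical, and `arrange()` plus `distinct()` is just one alternative way to keep the first row per country.

```r
library(dplyr)

# Toy data (hypothetical), deliberately not in date order
df <- data.frame(
  iso_code  = c("ABW", "ABW", "AFG", "AFG"),
  date      = c("2020-03-25", "2020-03-24", "2020-03-22", "2020-03-16"),
  new_cases = c(5, 8, 10, 6),
  stringsAsFactors = FALSE
)

first_dates <- df %>%
  mutate(date = as.Date(date)) %>%      # parse "YYYY-MM-DD" strings into Date values
  filter(new_cases >= 5) %>%            # rows with at least 5 new cases
  arrange(iso_code, date) %>%           # earliest date first within each country
  distinct(iso_code, .keep_all = TRUE)  # keep the first row seen for each country

print(first_dates)
```

If the dates were stored in another format (for example day/month/year), `as.Date()` would need a matching `format` argument.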