Question

I have df1:

    State       date fips score    score1 
1 Alabama 2020-03-24    1   242      0 
2 Alabama 2020-03-26    1   538      3
3 Alabama 2020-03-28    1   720      4
4 Alabama 2020-03-21    1   131      0
5 Alabama 2020-03-15    1    23      0 
6 Alabama 2020-03-18    1    51      0
7 Texas   2020-03-14    2    80      0
7 Texas   2020-03-16    2    102     0
7 Texas   2020-03-20    2    702     1
8 Texas   2020-03-23    2    1005    1

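I want to find the date on which each state's score first exceeds 100, and then select the row dated 7 days after that date. For example, Alabama passes 100 on March 21, so I want to keep the row for March 28.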

    State       date fips  score    score1 
3 Alabama 2020-03-28    1    720      4
8 Texas   2020-03-23    2    1005     1

Answer 1

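Using by (assuming a row for date + 7 exists in the data). Note that the code works on a data frame dat whose columns are named state and cases rather than State and score.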

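res <- do.call(rbind, by(dat, dat$state, function(x) {
  # rows of this state with more than 100 cases
  st <- x[x$cases > 100, ]
  # earliest date first, in case the rows are not sorted by date
  st <- st[order(as.Date(st$date)), ]
  # keep the row dated 7 days after the first date over 100
  st[as.Date(st$date) == as.Date(st$date[1]) + 7, ]
}))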
head(res)
#                  date      state fips cases deaths
# Alabama    2020-03-27    Alabama    1   639      4
# Alaska     2020-04-04     Alaska    2   169      3
# Arizona    2020-03-28    Arizona    4   773     15
# Arkansas   2020-03-28   Arkansas    5   409      5
# California 2020-03-15 California    6   478      6
# Colorado   2020-03-21   Colorado    8   475      6
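
For reference, the same idea can be applied to the df1 posted in the question. This is only a sketch: it rebuilds df1 by hand from the table above (without the row labels), uses the question's column names State and score, and takes the earliest date over 100 with min() because the rows of df1 are not sorted by date. The name res1 is made up for this example.

```r
# df1 rebuilt from the table in the question (row labels omitted)
df1 <- data.frame(
  State  = c(rep("Alabama", 6), rep("Texas", 4)),
  date   = c("2020-03-24", "2020-03-26", "2020-03-28", "2020-03-21",
             "2020-03-15", "2020-03-18", "2020-03-14", "2020-03-16",
             "2020-03-20", "2020-03-23"),
  fips   = c(rep(1, 6), rep(2, 4)),
  score  = c(242, 538, 720, 131, 23, 51, 80, 102, 702, 1005),
  score1 = c(0, 3, 4, 0, 0, 0, 0, 0, 1, 1),
  stringsAsFactors = FALSE
)
df1$date <- as.Date(df1$date)

res1 <- do.call(rbind, by(df1, df1$State, function(x) {
  over <- x[x$score > 100, ]
  if (nrow(over) == 0) return(NULL)  # this state never exceeds 100
  first_over <- min(over$date)       # earliest date, regardless of row order
  x[x$date == first_over + 7, ]      # row dated exactly 7 days later
}))
res1
```

On this data the result should contain the Alabama row for 2020-03-28 and the Texas row for 2020-03-23, which matches the expected output in the question.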

Answer 2

Here is a solution using tidyverse and lubridate.

library(tidyverse)
library(lubridate)

df1 %>%
  # Convert the date column to Date
  mutate_at(vars(date), ymd) %>%
  # Group by State
  group_by(State) %>%
  # Drop rows with a score of 100 or less
  filter(score > 100) %>%
  # Keep the row dated 7 days after the first date with score over 100
  filter(date == min(date) + days(7))
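
Both answers return nothing for a state that has no row dated exactly 7 days after the first date over 100. If that can happen, one possible fallback is to take the first available row on or after that date. The sketch below assumes this is acceptable, which the question does not say. It uses dplyr only (adding 7 to a Date works without lubridate), rebuilds df1 by hand from the question, and the names first_over and res2 are made up for this example.

```r
library(dplyr)

# df1 rebuilt from the table in the question (row labels omitted)
df1 <- data.frame(
  State  = c(rep("Alabama", 6), rep("Texas", 4)),
  date   = c("2020-03-24", "2020-03-26", "2020-03-28", "2020-03-21",
             "2020-03-15", "2020-03-18", "2020-03-14", "2020-03-16",
             "2020-03-20", "2020-03-23"),
  fips   = c(rep(1, 6), rep(2, 4)),
  score  = c(242, 538, 720, 131, 23, 51, 80, 102, 702, 1005),
  score1 = c(0, 3, 4, 0, 0, 0, 0, 0, 1, 1),
  stringsAsFactors = FALSE
)

res2 <- df1 %>%
  mutate(date = as.Date(date)) %>%
  group_by(State) %>%
  # drop states that never exceed 100
  filter(any(score > 100)) %>%
  # earliest date with score over 100, per state
  mutate(first_over = min(date[score > 100])) %>%
  # rows dated 7 or more days after that date
  filter(date >= first_over + 7) %>%
  # of those, keep the earliest one per state
  slice_min(date, n = 1, with_ties = FALSE) %>%
  ungroup() %>%
  select(-first_over)
res2
```

On the df1 from the question, every state has a row exactly 7 days later, so this should give the same two rows as the expected output. It only behaves differently when that row is missing.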

How do I select a row based on a date condition on another row?
