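Is there a method or function that can subset or filter data within the same ID based on the date range over which the data were observed? I have looked through many examples that use dplyr and lubridate, but none of them quite fit. Something like this pseudocode, maybe?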
DF %>%
group_by(ID) %>%
filter_if(for i %in% Date, between("Date 1 & Date 2 is at least 6 months"))
OR
DF %>%
filter_if(ID = >3 & between("Date 1 & Date 2 is at least 6 months"))
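Specifically, I want to keep the observations of an ID only if at least 3 of them fall within some 6-month date range. Using Cohort_month would also be fine, since it is extracted from the Date column.
My DF is: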
str(DF)
Classes ‘tbl_df’, ‘tbl’ and 'data.frame': 25 obs. of 8 variables:
 $ ID          : chr "AbDu" "AbDu" "AbDu" "AbDu" ...
 $ Reg         : num 29179 32039 35151 38359 41509 ...
 $ Date        : POSIXct, format: "2017-08-18" ...
 $ Year        : num 2017 2017 2017 2017 2017 ...
 $ Vol1        : num 2.5 2.5 2.5 2.5 2.5 2.5 4.9 2.5 2.5 4.9 ...
 $ Vol2        : num 2.5 2.5 2.5 2.5 2.5 2.5 4.9 2.5 2.5 4.9 ...
 $ VolT        : num 10 20 20 20 20 ...
 $ Cohort_month: num 8 9 10 11 12 1 1 3 4 11 ...
DF
# A tibble: 25 x 8
ID Reg Date Year Vol1 Vol2 VolT
<chr> <dbl> <dttm> <dbl> <dbl> <dbl> <dbl>
AbDu 29179 2017-08-18 00:00:00 2017 2.5 2.5 10
AbDu 32039 2017-09-15 00:00:00 2017 2.5 2.5 20
AbDu 35151 2017-10-13 00:00:00 2017 2.5 2.5 20
AbDu 38359 2017-11-10 00:00:00 2017 2.5 2.5 20
AbDu 41509 2017-12-08 00:00:00 2017 2.5 2.5 20
AbDu 44732 2018-01-08 00:00:00 2018 2.5 2.5 20
AbDu 47487 2018-01-31 00:00:00 2018 4.9 4.9 9.8
AbDu 52537 2018-03-14 00:00:00 2018 2.5 2.5 30
AbDu 57713 2018-05-23 00:00:00 2018 2.5 2.5 30
Answer 0 (score: 0)
Try the following solution:
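library(tidyverse)
library(lubridate)
DF %>%
  group_by(ID) %>%
  nest() %>%
  mutate(
    data_filter = map(
      data,
      ~ arrange(.x, Date) %>%
        mutate(
          # Date of the observation two rows earlier within the same ID
          Date2 = lag(Date, 2),
          # Span of the three observations in months (approximated as 30 days each)
          MDiff = as.numeric(difftime(Date, Date2, units = "days")) / 30
        ) %>%
        # Keep rows that close a run of 3 observations spanning under 6 months
        filter(MDiff < 6)
    ),
    n_row = map_dbl(data_filter, nrow)
  ) %>%
  filter(n_row > 0) %>%
  select(ID, data_filter) %>%
  unnest(cols = data_filter) %>%
  ungroup() %>%
  # pmap_df() matches ..1, ..2, ..3 by position, so keep exactly these columns
  select(ID, Date, Date2) %>%
  pmap_df(~ filter(DF, ID == ..1 & Date <= ..2 & Date >= ..3)) %>%
  # Overlapping windows return the same row more than once
  distinct()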
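The same idea can probably be written without nesting. The following is only a sketch under stated assumptions: 6 months is approximated as 180 days (matching the 30-day months above), the toy data and the helper name keep_dense_windows are made up for illustration, and a row is kept when it is the first, middle, or last of three consecutive observations (per ID, sorted by Date) that span less than that.

```r
library(dplyr)

# Hypothetical toy data: "AbDu" has three close observations plus one
# isolated one; "XyZz" has three observations spread over more than a year.
DF <- tibble(
  ID   = c(rep("AbDu", 4), rep("XyZz", 3)),
  Date = as.POSIXct(
    c("2017-08-18", "2017-09-15", "2017-10-13", "2018-09-01",
      "2017-01-01", "2017-06-01", "2018-02-01"),
    tz = "UTC"
  )
)

# keep_dense_windows() is an illustrative helper, not part of dplyr.
# max_days = 180 is an assumption standing in for "6 months".
keep_dense_windows <- function(data, max_days = 180) {
  data %>%
    group_by(ID) %>%
    arrange(Date, .by_group = TRUE) %>%
    mutate(
      day       = as.numeric(Date) / 86400,          # POSIXct seconds to days
      is_last   = day - lag(day, 2) < max_days,      # row closes a triple
      is_middle = lead(day) - lag(day) < max_days,   # row sits inside a triple
      is_first  = lead(day, 2) - day < max_days,     # row opens a triple
      # lag()/lead() give NA at the group edges; treat those as FALSE
      keep = coalesce(is_last, FALSE) |
        coalesce(is_middle, FALSE) |
        coalesce(is_first, FALSE)
    ) %>%
    ungroup() %>%
    filter(keep) %>%
    select(-day, -is_last, -is_middle, -is_first, -keep)
}

result <- keep_dense_windows(DF)
```

On this toy data, result holds only the three 2017 rows of "AbDu". Because each row is flagged once rather than looked up per window, no distinct() step is needed at the end.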