I have the following table and want to filter it based on the conditions below.
First, to reproduce the data:
dt1 <- data.frame(ID = c("a", "a", "a", "a", "a","a","a","a",
"b","b","b","b","b","b","b","b",
"c","c","c","c","c","c","c","c",
"d","d","d","d","d","d","d","d"), value = c(0,0,1,1,2,0,0,1,
1,1,1,2,2,2,2,2,
1,1,1,1,1,3,3,3,
0,2,2,2,2,2,2,3))
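Now I want to create a condition, by ID, such that the following hold:
1) the value increases (by >= 1)
2) the increased value then stays constant
3) the increase must start no later than the last 3 consecutive rows (basically, ID "d" does not qualify, since its final increase appears only in the very last row)
Based on the table above, only "b" and "c" qualify.
So far I have done the following, but it does not work for me, especially for the third condition.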
dt1 %>% group_by(ID) %>% mutate(change = value - lag(value)) %>%
filter(all(change %in% c(2, 1, 0, NA), na.rm = TRUE))
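To see why this attempt misses the third condition, here is a minimal base R sketch of the same test applied per ID. The name `passes` is introduced here purely for illustration, and `diff()` stands in for `value - lag(value)` without the leading `NA`.

```r
# Same data as above, written compactly
dt1 <- data.frame(
  ID = rep(c("a", "b", "c", "d"), each = 8),
  value = c(0,0,1,1,2,0,0,1,
            1,1,1,2,2,2,2,2,
            1,1,1,1,1,3,3,3,
            0,2,2,2,2,2,2,3),
  stringsAsFactors = FALSE
)

# For each ID: are all row-to-row changes one of 0, 1 or 2?
passes <- vapply(
  split(dt1$value, dt1$ID),
  function(v) all(diff(v) %in% c(0, 1, 2)),
  logical(1)
)

names(passes)[passes]
# "b" "c" "d"
```

Only "a" is dropped (it has a change of -2). "d" survives because the test looks at the size of each change, not at how many rows the increased value is held.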
Answer 0 (score: 2)
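One option is to group by 'ID', `filter` the groups whose adjacent elements only increase, without any decrease in value, and then filter the groups where the frequency of each run of 'value' is greater than or equal to 3 for `all` the elements.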
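library(tidyverse)
library(data.table)  # provides rleid()
dt1 %>%
  group_by(ID) %>%
  # keep groups in which 'value' never decreases
  filter(n_distinct(cumsum(c(1, diff(value) < 0))) == 1) %>%
  # keep groups in which every run of equal values spans at least 3 rows
  filter(all(table(rleid(value)) >= 3))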
# A tibble: 16 x 2
# Groups: ID [4]
# ID value
# <fct> <dbl>
# 1 b 1
# 2 b 1
# 3 b 1
# 4 b 2
# 5 b 2
# 6 b 2
# 7 b 2
# 8 b 2
# 9 c 1
#10 c 1
#11 c 1
#12 c 1
#13 c 1
#14 c 3
#15 c 3
#16 c 3
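If loading data.table only for `rleid()` is not wanted, the same two checks can be sketched in base R with `rle()`. This is an illustrative alternative, not the code from the answer above. `keep_group`, `keep` and `res` are names made up for this sketch.

```r
dt1 <- data.frame(
  ID = rep(c("a", "b", "c", "d"), each = 8),
  value = c(0,0,1,1,2,0,0,1,
            1,1,1,2,2,2,2,2,
            1,1,1,1,1,3,3,3,
            0,2,2,2,2,2,2,3),
  stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE when a group's values never decrease
# and every run of identical values is at least 3 rows long
keep_group <- function(v) {
  no_decrease <- all(diff(v) >= 0)
  long_runs   <- all(rle(v)$lengths >= 3)
  no_decrease && long_runs
}

keep <- vapply(split(dt1$value, dt1$ID), keep_group, logical(1))
res  <- dt1[dt1$ID %in% names(keep)[keep], ]

unique(res$ID)
# "b" "c"
```

Note that, like the pipeline above, this sketch does not test that an increase actually occurs, so a group whose value is constant throughout would also be kept. Adding `any(diff(v) >= 1)` to the helper would be one way to enforce the first condition, if that is what is intended.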