Filtering on specific conditions within a vector

Time: 2019-05-14 18:13:58

Tags: r

I have the following table and want to filter it based on the conditions below.

First, the data to copy:

dt1 <- data.frame(ID = c("a", "a", "a", "a", "a","a","a","a",
                     "b","b","b","b","b","b","b","b",
                     "c","c","c","c","c","c","c","c",
                     "d","d","d","d","d","d","d","d"), value = c(0,0,1,1,2,0,0,1,
                                                                 1,1,1,2,2,2,2,2,
                                                                 1,1,1,1,1,3,3,3,
                                                                 0,2,2,2,2,2,2,3))

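Now I want to build a filter, applied per ID, that keeps an ID only when all of the following hold:

1) The value increases (by >= 1)
2) The increased value then stays constant
3) The increase starts, at the latest, 3 consecutive rows before the end (basically, ID "d" does not qualify)

For the table above, only "b" and "c" meet the conditions.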
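To make the conditions concrete, the following sketch in base R run-length encodes the values of each ID, so the length of every constant stretch is visible. It assumes the same data as above (rebuilt here with `rep()` so the block is self-contained); `summarise_runs` is a hypothetical helper name, not part of the question.

```r
# Same data as in the question, rebuilt compactly
dt1 <- data.frame(
  ID = rep(c("a", "b", "c", "d"), each = 8),
  value = c(0, 0, 1, 1, 2, 0, 0, 1,
            1, 1, 1, 2, 2, 2, 2, 2,
            1, 1, 1, 1, 1, 3, 3, 3,
            0, 2, 2, 2, 2, 2, 2, 3)
)

# Run-length encode the values of each ID
runs <- lapply(split(dt1$value, dt1$ID), rle)

# Hypothetical helper: one row per run, with the value and how long it lasts
summarise_runs <- function(r) {
  data.frame(value = r$values, length = r$lengths)
}

run_tables <- lapply(runs, summarise_runs)
run_tables
```

Read this way, "a" drops back from 2 to 0, and "d" only reaches its final value on the very last row, whereas "b" and "c" each end on a run of at least 3 rows.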

So far I have tried the following, but it does not work for me, especially for the third condition.

library(dplyr)
dt1 %>% group_by(ID) %>% mutate(change = value - lag(value)) %>%
  filter(all(change %in% c(2, 1, 0, NA), na.rm = TRUE))
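A minimal sketch, in base R, of why this attempt cannot enforce the third condition: it only asks whether every row-to-row change is 0, 1 or 2, and never looks at where the increase happens. The data is rebuilt with `rep()` so the block runs on its own, and `passes_attempt` is a name introduced here for illustration.

```r
# Same data as in the question, rebuilt compactly
dt1 <- data.frame(
  ID = rep(c("a", "b", "c", "d"), each = 8),
  value = c(0, 0, 1, 1, 2, 0, 0, 1,
            1, 1, 1, 2, 2, 2, 2, 2,
            1, 1, 1, 1, 1, 3, 3, 3,
            0, 2, 2, 2, 2, 2, 2, 3)
)

# Row-to-row changes per ID (like value - lag(value), minus the leading NA)
changes <- lapply(split(dt1$value, dt1$ID), diff)

# The attempted filter: is every change one of 0, 1 or 2?
passes_attempt <- vapply(changes, function(ch) all(ch %in% c(0, 1, 2)), logical(1))
passes_attempt
```

Under this check "a" is rejected because of its drop from 2 to 0, but "d" still passes, since its changes (2, then zeros, then 1) are all in the allowed set.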

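1 Answer:

Answer 0 (score: 2)

One option is to group by 'ID', use filter to keep only the groups whose adjacent elements never decrease, and then filter again to keep the groups in which every run of identical 'value' entries has a frequency greater than or equal to 3 (checked with all)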

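library(tidyverse)
library(data.table)  # provides rleid()
dt1 %>%
   group_by(ID) %>%
   # keep groups whose value never decreases from one row to the next
   filter(n_distinct(cumsum(c(1, diff(value) < 0))) == 1) %>%
   # keep groups in which every run of identical values lasts at least 3 rows
   filter(all(table(rleid(value)) >= 3))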
# A tibble: 16 x 2
# Groups:   ID [4]
#   ID    value
#   <fct> <dbl>
# 1 b         1
# 2 b         1
# 3 b         1
# 4 b         2
# 5 b         2
# 6 b         2
# 7 b         2
# 8 b         2
# 9 c         1
#10 c         1
#11 c         1
#12 c         1
#13 c         1
#14 c         3
#15 c         3
#16 c         3
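
For readers who prefer to avoid the extra packages, the same two checks can be sketched in base R with `diff()` and `rle()`. This is an illustrative equivalent under the same assumptions, not the answerer's code; `keeps_group` and its `min_run` argument are hypothetical names.

```r
# Same data as in the question, rebuilt compactly
dt1 <- data.frame(
  ID = rep(c("a", "b", "c", "d"), each = 8),
  value = c(0, 0, 1, 1, 2, 0, 0, 1,
            1, 1, 1, 2, 2, 2, 2, 2,
            1, 1, 1, 1, 1, 3, 3, 3,
            0, 2, 2, 2, 2, 2, 2, 3)
)

# Hypothetical helper: TRUE when the values never decrease and
# every run of identical values lasts at least min_run rows
keeps_group <- function(v, min_run = 3) {
  never_decreases <- all(diff(v) >= 0)
  runs_long_enough <- all(rle(v)$lengths >= min_run)
  never_decreases && runs_long_enough
}

# Evaluate the check once per ID, then keep the rows of the passing IDs
keep <- vapply(split(dt1$value, dt1$ID), keeps_group, logical(1))
kept_ids <- names(keep)[keep]
result <- dt1[dt1$ID %in% kept_ids, ]
result
```

Note that both this sketch and the pipeline above would also keep a group whose value never changes at all; if an actual increase is required, an extra check such as `length(rle(v)$lengths) > 1` could be added.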