How to mutate a column based on values occurring in a particular sequence?

Time: 2020-08-04 16:05:10

Tags: r data-manipulation mutate

I have a data frame df:

  df <- data.frame(ID = c(1,1,1,2,2,2,3,3,3,4,4,4,4),  process = c("inspection", "evaluation", "result","inspection", "result", "evaluation", "result", "inspection","result","evaluation","result","result","evaluation"))

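I need to add a column true_process such that, within a given ID, the value is true if evaluation comes before result. If evaluation comes after result or is missing, the value should be false.

The code I tried is shown below, followed by the output it gives.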

library(dplyr)
df %>% 
    group_by(ID) %>% 
    mutate(true_process = case_when(
        !any(process == "evaluation") ~ "False",
        length(process == "evaluation")[[1]] > length(process == "result")[[1]] ~ "False",
        TRUE ~ "True"
    )) 
# A tibble: 13 x 3
# Groups:   ID [4]
      ID process    true_process
   <dbl> <fct>      <chr>       
 1     1 inspection True        
 2     1 evaluation True        
 3     1 result     True        
 4     2 inspection True        
 5     2 result     True        
 6     2 evaluation True        
 7     3 result     False       
 8     3 inspection False       
 9     3 result     False       
10     4 evaluation True        
11     4 result     True        
12     4 result     True        
13     4 evaluation True 

1 answer:

Answer 0 (score: 3)

Based on the updated data, you can check, within each ID, whether the index of the last instance of evaluation is less than any index of result.
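
To make the index comparison concrete, here is a minimal base R sketch for a single group. The vector is copied from ID 4 in the question's data; the variable names (eval_idx, result_idx, last_eval) are illustrative only and are not part of the original answer.

```r
# Process values for ID 4, copied from the question's data
process <- c("evaluation", "result", "result", "evaluation")

eval_idx   <- which(process == "evaluation")  # positions of "evaluation"
result_idx <- which(process == "result")      # positions of "result"
last_eval  <- tail(eval_idx, 1)               # position of the last "evaluation"

# TRUE only if the last evaluation sits before at least one result
any(last_eval < result_idx)
```

Here the last evaluation is in position 4 and both results come earlier, so the check is FALSE for this group. Applied per ID, the same comparison gives one flag per group.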

library(dplyr)

df %>%
  group_by(ID) %>%
  mutate(true_process = any(tail(which(process == "evaluation"), 1) < which(process == "result")))


# A tibble: 13 x 3
# Groups:   ID [4]
      ID process    true_process
   <dbl> <chr>      <lgl>       
 1     1 inspection TRUE        
 2     1 evaluation TRUE        
 3     1 result     TRUE        
 4     2 inspection FALSE       
 5     2 result     FALSE       
 6     2 evaluation FALSE       
 7     3 result     FALSE       
 8     3 inspection FALSE       
 9     3 result     FALSE       
10     4 evaluation FALSE       
11     4 result     FALSE       
12     4 result     FALSE       
13     4 evaluation FALSE
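
The FALSE for ID 3, which has no evaluation at all, may be worth spelling out. The sketch below wraps the same check in a helper and applies it per ID with base R's tapply instead of dplyr. The helper name eval_before_result is hypothetical and not from the answer; it is only meant to show the empty-group behaviour.

```r
# Hypothetical helper (not from the original answer): TRUE when the last
# "evaluation" in a group comes before at least one "result".
eval_before_result <- function(process) {
  # tail() of an empty index vector is still empty (integer(0))
  last_eval <- tail(which(process == "evaluation"), 1)
  # Comparing integer(0) gives logical(0), and any(logical(0)) is FALSE,
  # so a group with no evaluation is flagged FALSE
  any(last_eval < which(process == "result"))
}

# Data copied from the question
df <- data.frame(
  ID = c(1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 4),
  process = c("inspection", "evaluation", "result",
              "inspection", "result", "evaluation",
              "result", "inspection", "result",
              "evaluation", "result", "result", "evaluation")
)

# One logical flag per ID; as.character() guards against factor columns
flags <- tapply(as.character(df$process), df$ID, eval_before_result)
print(flags)
```

This relies on any() returning FALSE for an empty logical vector, which is why the missing case needs no separate branch.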