Remove individuals from cohort study data based on baseline characteristics

Time: 2019-06-27 16:23:35

Tags: r tidyverse

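I have health data from a cohort study with repeated measures, in which each individual has multiple annual follow-up visits. At baseline (visit 0), some individuals have already been diagnosed with the disease of interest, while others have not. Since I am looking at incident cases in my analysis, I need to remove from the data those individuals who were diagnosed as "sick" at visit 0. How can I do this in the tidyverse? Below is an example of the data structure I will be working with: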

subject_id <- c(1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4,5,5,5,5)
visit <- c(0,1,2,3,0,1,2,3,0,1,2,3,0,1,2,3,0,1,2,3)
diagnosis <- c("not sick", "not sick", "not sick", "sick", "sick", "sick", "sick", "sick", "not sick", "not sick", "sick", "sick", "sick", "sick", "sick", "sick", "not sick", "not sick", "not sick", "sick")

cohort <- data.frame(subject_id, visit, diagnosis)
cohort

3 answers:

Answer 0 (score: 2)

Edit: if you want to remove them entirely:

Window

Original

We can do:

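library(dplyr)

cohort %>% 
  group_by(subject_id) %>% 
  # Flag rows where the subject is already "sick" at baseline (visit 0)
  mutate(Condn = ifelse(visit==0 & diagnosis=="sick",1,0) ) %>% 
  # Keep only subjects that have no flagged row
  filter(all(Condn==0))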
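
The same idea can probably be written without the helper column. The following is a minimal sketch, not part of the original answer: it rebuilds the example data from the question and drops every subject that has any row with visit 0 and a "sick" diagnosis. The name `incident` is only an illustrative choice.

```r
library(dplyr)

# Example data copied from the question
subject_id <- c(1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4,5,5,5,5)
visit <- c(0,1,2,3,0,1,2,3,0,1,2,3,0,1,2,3,0,1,2,3)
diagnosis <- c("not sick", "not sick", "not sick", "sick",
               "sick", "sick", "sick", "sick",
               "not sick", "not sick", "sick", "sick",
               "sick", "sick", "sick", "sick",
               "not sick", "not sick", "not sick", "sick")
cohort <- data.frame(subject_id, visit, diagnosis)

# `incident` is a hypothetical name for the filtered data
incident <- cohort %>%
  group_by(subject_id) %>%
  # Drop the whole subject if any of their rows is a "sick" baseline visit
  filter(!any(visit == 0 & diagnosis == "sick")) %>%
  ungroup()
```

With the example data this should leave only the rows of subjects 1, 3 and 5.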

Answer 1 (score: 1)

With dplyr, you can do:

cohort %>%
 group_by(subject_id) %>%
 filter(first(diagnosis) != "sick")

   subject_id visit diagnosis
        <dbl> <dbl> <fct>    
 1          1     0 not sick 
 2          1     1 not sick 
 3          1     2 not sick 
 4          1     3 sick     
 5          3     0 not sick 
 6          3     1 not sick 
 7          3     2 sick     
 8          3     3 sick     
 9          5     0 not sick 
10          5     1 not sick 
11          5     2 not sick 
12          5     3 sick   
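
Note that `first(diagnosis)` takes whichever row comes first within each subject, so it relies on the rows being sorted by visit. If that ordering cannot be guaranteed, one option is the `order_by` argument of `first()`. The sketch below is an illustration under that assumption: it deliberately shuffles the example data before filtering, and `shuffled` and `incident` are hypothetical names.

```r
library(dplyr)

# Example data copied from the question
subject_id <- c(1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4,5,5,5,5)
visit <- c(0,1,2,3,0,1,2,3,0,1,2,3,0,1,2,3,0,1,2,3)
diagnosis <- c("not sick", "not sick", "not sick", "sick",
               "sick", "sick", "sick", "sick",
               "not sick", "not sick", "sick", "sick",
               "sick", "sick", "sick", "sick",
               "not sick", "not sick", "not sick", "sick")
cohort <- data.frame(subject_id, visit, diagnosis)

# Shuffle the rows to mimic data that is not sorted by visit
set.seed(1)
shuffled <- cohort[sample(nrow(cohort)), ]

incident <- shuffled %>%
  group_by(subject_id) %>%
  # Take the diagnosis at the earliest visit, regardless of row order
  filter(first(diagnosis, order_by = visit) != "sick") %>%
  ungroup() %>%
  arrange(subject_id, visit)
```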

Or:

cohort %>%
 group_by(subject_id) %>%
 filter(diagnosis[row_number() == 1] != "sick")
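
Another way to express the same exclusion, shown here only as a sketch and not as part of the original answer, is to first collect the IDs of subjects who are "sick" at visit 0 and then remove them with `anti_join()`. This does not depend on row order within a subject. The names `prevalent` and `incident` are illustrative assumptions.

```r
library(dplyr)

# Example data copied from the question
subject_id <- c(1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4,5,5,5,5)
visit <- c(0,1,2,3,0,1,2,3,0,1,2,3,0,1,2,3,0,1,2,3)
diagnosis <- c("not sick", "not sick", "not sick", "sick",
               "sick", "sick", "sick", "sick",
               "not sick", "not sick", "sick", "sick",
               "sick", "sick", "sick", "sick",
               "not sick", "not sick", "not sick", "sick")
cohort <- data.frame(subject_id, visit, diagnosis)

# Subjects already diagnosed at baseline (prevalent cases)
prevalent <- cohort %>%
  filter(visit == 0, diagnosis == "sick") %>%
  distinct(subject_id)

# Keep every row whose subject_id is NOT among the prevalent cases
incident <- cohort %>%
  anti_join(prevalent, by = "subject_id")
```

Keeping `prevalent` as its own object also makes it easy to check how many subjects were excluded.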

Answer 2 (score: 0)

Thanks, everyone, for the suggestions. @tmfmnk and @NelsonGon both provided options that work for this task.

I recently switched from SAS to R, so this is very helpful.