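I have health data from a cohort study with repeated measures, where each person has multiple follow-up visits per year. At baseline (visit 0), some people have already been diagnosed with the disease of interest, while others have not. When I look at incident cases in my analysis, I need to remove the people who were diagnosed as "sick" at visit 0 from the data. How can I do this in the tidyverse? Below is an example of the data structure I will be working with: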
subject_id <- c(1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4,5,5,5,5)
visit <- c(0,1,2,3,0,1,2,3,0,1,2,3,0,1,2,3,0,1,2,3)
diagnosis <- c("not sick", "not sick", "not sick", "sick", "sick", "sick", "sick", "sick", "not sick", "not sick", "sick", "sick", "sick", "sick", "sick", "sick", "not sick", "not sick", "not sick", "sick")
cohort <- data.frame(subject_id, visit, diagnosis)
cohort
Answer 0 (score: 2)
Edit: If you want to remove them entirely, then:
Original:
We can do:
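library(dplyr)

cohort %>%
  group_by(subject_id) %>%
  # flag the baseline row of anyone already sick at visit 0
  mutate(Condn = ifelse(visit == 0 & diagnosis == "sick", 1, 0)) %>%
  # keep only subjects that have no flagged row
  filter(all(Condn == 0))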
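The same idea can be written without the helper column. The following is a minimal sketch, not part of the original answer: it rebuilds the question's example data (using `rep()` for brevity) and drops every subject that has any row with visit 0 and a "sick" diagnosis. The name `incident` is only an illustrative choice.

```r
library(dplyr)

# Same values as the example data in the question
cohort <- data.frame(
  subject_id = rep(1:5, each = 4),
  visit      = rep(0:3, times = 5),
  diagnosis  = c("not sick", "not sick", "not sick", "sick",
                 "sick", "sick", "sick", "sick",
                 "not sick", "not sick", "sick", "sick",
                 "sick", "sick", "sick", "sick",
                 "not sick", "not sick", "not sick", "sick")
)

# Keep a subject only if none of their rows is a "sick" baseline row
incident <- cohort %>%
  group_by(subject_id) %>%
  filter(!any(visit == 0 & diagnosis == "sick")) %>%
  ungroup()
```

Because the condition is evaluated per group and returns a single TRUE or FALSE, `filter()` keeps or drops all rows of a subject together, and no `Condn` column is left behind in the result.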
Answer 1 (score: 1)
With dplyr, you can do the following:
cohort %>%
  group_by(subject_id) %>%
  # first() is the first row per subject, so rows must be sorted by visit
  filter(first(diagnosis) != "sick")
   subject_id visit diagnosis
        <dbl> <dbl> <fct>
 1          1     0 not sick
 2          1     1 not sick
 3          1     2 not sick
 4          1     3 sick
 5          3     0 not sick
 6          3     1 not sick
 7          3     2 sick
 8          3     3 sick
 9          5     0 not sick
10          5     1 not sick
11          5     2 not sick
12          5     3 sick
Or:
cohort %>%
  group_by(subject_id) %>%
  # same check, selecting the first row of each subject by position
  filter(diagnosis[row_number() == 1] != "sick")
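Both versions above pick the first row of each subject by position, so they assume the rows are already sorted with visit 0 first. If that is not guaranteed, one possible alternative (a sketch, not taken from the answers here) is to look up the baseline-sick subjects by `visit == 0` and remove them with an anti-join. The names `baseline_sick` and `incident` are illustrative only.

```r
library(dplyr)

# Same values as the example data in the question
cohort <- data.frame(
  subject_id = rep(1:5, each = 4),
  visit      = rep(0:3, times = 5),
  diagnosis  = c("not sick", "not sick", "not sick", "sick",
                 "sick", "sick", "sick", "sick",
                 "not sick", "not sick", "sick", "sick",
                 "sick", "sick", "sick", "sick",
                 "not sick", "not sick", "not sick", "sick")
)

# Subjects already diagnosed at baseline, found by value rather than row position
baseline_sick <- cohort %>%
  filter(visit == 0, diagnosis == "sick") %>%
  distinct(subject_id)

# Drop every row belonging to those subjects
incident <- cohort %>%
  anti_join(baseline_sick, by = "subject_id")
```

This does not depend on row order, and a subject with no visit-0 row is simply kept, which may or may not be what the analysis needs.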
Answer 2 (score: 0)
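Thanks, everyone, for the suggestions. @tmfmnk and @NelsonGon both provided options that work for this task.
I recently moved from SAS to R, so this is very helpful.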