Question

My data looks like this:

PatientID=c(1,1,1,1,2,2,2,3,3,3,3,3)
VisitId=c(1,5,6,9,2,3,12,4,7,8,10,11) 
target=c(0,0,0,1,0,0,0,0,0,0,1,0)

df <- data.frame(PatientID, VisitId, target)
df

   PatientID VisitId target
1          1       1      0
2          1       5      0
3          1       6      0
4          1       9      1
5          2       2      0
6          2       3      0
7          2      12      0
8          3       4      0
9          3       7      0
10         3       8      0
11         3      10      1
12         3      11      0

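For each PatientID, I need to delete the rows whose VisitId is equal to or greater than the VisitId of the row where target is 1.

That is, in the example, rows 4, 11 and 12 should be removed, because those rows occurred for that patient at the same time as or after the target event - I want to predict...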

Answer 1

Here is an idea using dplyr. It assumes that each PatientID has at most one row with target equal to 1.

library(dplyr)

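df %>% 
 group_by(PatientID) %>% 
 # new = VisitId of the patient's target row on every row (0 if the patient has no target)
 mutate(new = ifelse(target == 1, VisitId, 0), 
        new = replace(new, new == 0, max(new))) %>% 
 # keep rows before the target visit, or all rows when the patient has no target
 filter((target != 1 & VisitId < new) | new == 0) %>% 
 select(-new)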

which gives,

# A tibble: 9 x 3
# Groups:   PatientID [3]
  PatientID VisitId target
      <dbl>   <dbl>  <dbl>
1         1       1      0
2         1       5      0
3         1       6      0
4         2       2      0
5         2       3      0
6         2      12      0
7         3       4      0
8         3       7      0
9         3       8      0
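
For comparison, the same rule can be sketched in base R without dplyr. This is only an illustration, not part of the answer above: the names cutoff and res are made up here, and a patient with no target row is assumed to get a cutoff of Inf so that all of their rows are kept. Taking the minimum means that a patient with more than one target row would be cut at the earliest one.

```r
# Rebuild the example data so the snippet runs on its own
df <- data.frame(
  PatientID = c(1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 3, 3),
  VisitId   = c(1, 5, 6, 9, 2, 3, 12, 4, 7, 8, 10, 11),
  target    = c(0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0)
)

# Per-patient cutoff (hypothetical helper column): the smallest VisitId among
# rows with target == 1, or Inf when the patient has no such row
cutoff <- ave(ifelse(df$target == 1, df$VisitId, Inf), df$PatientID, FUN = min)

# Keep only the visits strictly before the cutoff
res <- df[df$VisitId < cutoff, ]
res
```

On the example data this keeps the same nine rows as the dplyr version, with the original row names (1, 2, 3, 5, 6, 7, 8, 9, 10) preserved.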
