Question

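I have a set of patient IDs, each with a record date, a first-visit date, and a disease status. I want to remove every row for a patient ID if that patient has a record date equal to the first-visit date. My dataset looks like this: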

p_id Record_date  fvdate     Disease
 12  02-03-2017  02-03-2017   1
 12  05-03-2017  02-03-2017   0
 12  03-04-2018  02-03-2017   1
 11  04-05-2016  05-06-2017   0
 13  18-06-2017  18-06-2017   1
 13  03-05-2018  18-06-2017   0      
 13  09-09-2019  18-06-2017   0
 14  09-12-2017  03-01-2018   1

The output I need:

p_id  Record_date      fvdate     Disease
11    04-05-2016      05-06-2017    0
14    09-12-2017      03-01-2018    1

Thanks in advance.

Answer 1

We can group by p_id and keep only the groups whose first Record_date differs from their first fvdate.

library(dplyr)
df %>% group_by(p_id) %>% filter(first(Record_date) != first(fvdate))

#  p_id Record_date fvdate     Disease
#  <int> <fct>       <fct>        <int>
#1    11 04-05-2016  05-06-2017       0
#2    14 09-12-2017  03-01-2018       1

Or, to drop a patient when any of their Record_date values equals the first fvdate:

df %>% group_by(p_id) %>% filter(!any(Record_date == first(fvdate)))
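The two filters agree on the sample data, but they are not equivalent in general: the first() version inspects only the first row of each group, while the any() version inspects every row. A minimal sketch of the difference, using a small made-up data frame (`toy`, with hypothetical patients 21 and 22 that are not part of the question's data):

```r
library(dplyr)

# Hypothetical data: patient 21 has a matching date only on the SECOND row.
toy <- data.frame(
  p_id        = c(21L, 21L, 22L),
  Record_date = c("01-01-2020", "05-01-2020", "03-02-2020"),
  fvdate      = c("05-01-2020", "05-01-2020", "04-02-2020")
)

# Checks only the first row of each group, so patient 21 is kept.
by_first <- toy %>%
  group_by(p_id) %>%
  filter(first(Record_date) != first(fvdate)) %>%
  ungroup()

# Checks every row of each group, so patient 21 is dropped.
by_any <- toy %>%
  group_by(p_id) %>%
  filter(!any(Record_date == first(fvdate))) %>%
  ungroup()
```

If the rows are not guaranteed to be sorted so that the matching record comes first, the any() form is probably the safer choice.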

Data

df <- structure(list(p_id = c(12L, 12L, 12L, 11L, 13L, 13L, 13L, 14L
), Record_date = c("02-03-2017", "05-03-2017", "03-04-2018", 
"04-05-2016", "18-06-2017", "03-05-2018", "09-09-2019", "09-12-2017"
), fvdate = c("02-03-2017", "02-03-2017", "02-03-2017", "05-06-2017", 
"18-06-2017", "18-06-2017", "18-06-2017", "03-01-2018"), Disease = c(1L, 
0L, 1L, 0L, 1L, 0L, 0L, 1L)), row.names = c(NA, -8L), class = "data.frame")
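
The dates above are stored as character strings, so the comparisons rely on both columns using exactly the same text format. A possible variant, sketched in base R, converts them to Date first and then drops every patient that has at least one row where the two dates match. It assumes the strings are day-month-year ("%d-%m-%Y") and compares the dates row by row, which is one reading of the question; the names `flagged` and `result` are just illustrative.

```r
# Same data as above, rebuilt here so the example runs on its own.
df <- data.frame(
  p_id = c(12L, 12L, 12L, 11L, 13L, 13L, 13L, 14L),
  Record_date = c("02-03-2017", "05-03-2017", "03-04-2018", "04-05-2016",
                  "18-06-2017", "03-05-2018", "09-09-2019", "09-12-2017"),
  fvdate = c("02-03-2017", "02-03-2017", "02-03-2017", "05-06-2017",
             "18-06-2017", "18-06-2017", "18-06-2017", "03-01-2018"),
  Disease = c(1L, 0L, 1L, 0L, 1L, 0L, 0L, 1L)
)

# Assumption: dates are written as day-month-year.
df$Record_date <- as.Date(df$Record_date, format = "%d-%m-%Y")
df$fvdate      <- as.Date(df$fvdate, format = "%d-%m-%Y")

# Patients with at least one row where the record date equals the first-visit date.
flagged <- unique(df$p_id[df$Record_date == df$fvdate])

# Keep only the rows of patients that were never flagged.
result <- df[!df$p_id %in% flagged, ]
```

On the sample data this leaves only the rows for p_id 11 and 14, matching the requested output.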
