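I have a dataset of hospital records and need to check, for each patient, whether 'creatinine' increases by >= 0.3 within 48 hours and, if it does, report that increase. My problem is that the 48-hour interval has to slide from the start of the records to the end, because an increase can occur in any interval.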
Example:
library(data.table)
dat = data.table(
patient_id=c(rep(1,7),rep(2,5)),
measurement=c("1","2","3","4","5","6","7","1","2","3","4","5"),
t=c("2019-01-19 05:00","2019-01-19 14:00","2019-01-20 05:00","2019-01-20 15:00","2019-01-21 03:00","2019-01-22 05:00","2019-01-23 05:00","2019-01-19 05:00","2019-01-19 14:00","2019-01-20 05:00","2019-01-20 15:00","2019-01-21 03:00"),
creatinine=c("0.81","0.90","1.00","1.10","1.20","1.30","1.40","0.81","0.90","1.00","1.10","1.20")
)
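So 48-hour interval 1 covers measurements 1 to 5. The first creatinine increase of >= 0.3 is from measurement 1 to measurement 5. But interval 1 might also contain no increase, in which case I have to check interval 2 (measurements 2 to 6) for an increase, and so on.
I was thinking of determining the minimum and maximum within each interval and taking their difference, so that I can tell whether an increase of >= 0.3 occurs in that interval. However, I don't know how to slide the 48-hour interval from the start of the records to the end.
I hope my question is clear. Thank you very much for any help or suggestions.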
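Answer 0: (score: 0)
Using zoo::rollapply, we can find the range of the observations in each window (5 observations for the first window in the example), then subtract the lower value from the upper one to check whether the difference is >= 0.3.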
library(zoo)
library(dplyr)
library(tidyr) #nest and unnest functions
library(lubridate) #ymd and hours function
library(purrr) #map function
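# parse the timestamps and compute the 48-hour cutoff for each measurement
dat$t<-ymd_hm(dat$t)
dat$two_days<-dat$t+hours(48)
# for each row, count the measurements from that row onward that fall on or before its 48-hour cutoff
fun_wdate<-function(df){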
apply(df,1, function(y){
if(y['measurement']==1){
sum((ymd_hms(y['two_days'])<df$t)=='FALSE')
} else{
sum((ymd_hms(y['two_days'])<df$t[-c(1:y['measurement']-1)])=='FALSE')
}
})
}
dat <- dat %>% group_by(patient_id) %>%
mutate(width=tibble(measurement,t,two_days)%>%fun_wdate)
#Another option
#dat %>% group_by(patient_id) %>%nest() %>% mutate(width=map(data,~fun_wdate(.))) %>% unnest()
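# Inc is the signed change within each window: positive when the minimum comes before the maximum
# Flag compares against 0.29999 rather than 0.3, presumably to tolerate floating-point rounding
dat %>% group_by(patient_id) %>%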
mutate(Inc=rollapply(as.numeric(creatinine),width,
FUN=function(x) (if_else (which.min(x)<which.max(x), range(x)[2]-range(x)[1], range(x)[1]-range(x)[2])),
align='left',fill=NA), Flag=if_else(Inc>=0.29999,'Yes','No'))
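The per-row window width that fun_wdate computes can probably also be obtained without apply. The following is only a sketch, assuming the dat table from the question and that measurements are sorted by time within each patient. It uses base R's findInterval, which returns, for each cutoff, how many sorted timestamps are less than or equal to it.

```r
library(data.table)

dat <- data.table(
  patient_id = c(rep(1, 7), rep(2, 5)),
  measurement = c("1","2","3","4","5","6","7","1","2","3","4","5"),
  t = c("2019-01-19 05:00","2019-01-19 14:00","2019-01-20 05:00","2019-01-20 15:00",
        "2019-01-21 03:00","2019-01-22 05:00","2019-01-23 05:00","2019-01-19 05:00",
        "2019-01-19 14:00","2019-01-20 05:00","2019-01-20 15:00","2019-01-21 03:00"),
  creatinine = c("0.81","0.90","1.00","1.10","1.20","1.30","1.40",
                 "0.81","0.90","1.00","1.10","1.20")
)

# Parse the timestamps (UTC is an assumption; the question gives no time zone)
dat[, t := as.POSIXct(t, format = "%Y-%m-%d %H:%M", tz = "UTC")]

# For row i, findInterval gives the index of the last measurement at or before
# t[i] + 48 hours; subtracting i and adding 1 turns that index into a window width
dat[, width := findInterval(as.numeric(t) + 48 * 3600, as.numeric(t)) - seq_len(.N) + 1L,
    by = patient_id]

print(dat[, .(patient_id, measurement, width)])
```

The resulting width column can then be passed to rollapply in the same way as the one produced by fun_wdate.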
Answer 1: (score: 0)
Is this what you are looking for?
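library(dplyr)
# compare each measurement with the one 4 rows earlier (e.g. measurement 5 vs. 1);
# this assumes every 48-hour window spans 5 measurements
dat %>%
filter(as.numeric(creatinine) - lag(as.numeric(creatinine), 4) >= 0.3)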
If you also have a patient_id column, you can do this:
library(dplyr)
dat %>%
group_by(patient_id) %>%
filter(as.numeric(creatinine) - lag(as.numeric(creatinine), 4) >= 0.3) %>%
slice(1)
slice(1) returns only the first increase of 0.3 for each patient.
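A fixed lag only matches a 48-hour window when the measurements are evenly spaced. As a sketch of a time-based check instead, assuming the dat table from the question, each measurement can be compared with the lowest creatinine recorded in the 48 hours before it. The helper rise_48h and the tolerance 1e-8 are illustrative choices, not something from the question.

```r
library(data.table)

dat <- data.table(
  patient_id = c(rep(1, 7), rep(2, 5)),
  measurement = c("1","2","3","4","5","6","7","1","2","3","4","5"),
  t = c("2019-01-19 05:00","2019-01-19 14:00","2019-01-20 05:00","2019-01-20 15:00",
        "2019-01-21 03:00","2019-01-22 05:00","2019-01-23 05:00","2019-01-19 05:00",
        "2019-01-19 14:00","2019-01-20 05:00","2019-01-20 15:00","2019-01-21 03:00"),
  creatinine = c("0.81","0.90","1.00","1.10","1.20","1.30","1.40",
                 "0.81","0.90","1.00","1.10","1.20")
)

# Parse the columns (UTC is an assumption; the question gives no time zone)
dat[, t := as.POSIXct(t, format = "%Y-%m-%d %H:%M", tz = "UTC")]
dat[, creatinine := as.numeric(creatinine)]

# Hypothetical helper: for each measurement j, the rise over the lowest value
# seen in the 48 hours up to and including j (0 when j itself is the lowest)
rise_48h <- function(time, value) {
  secs <- as.numeric(time)
  vapply(seq_along(value), function(j) {
    in_window <- secs <= secs[j] & secs >= secs[j] - 48 * 3600
    value[j] - min(value[in_window])
  }, numeric(1))
}

dat[, rise := rise_48h(t, creatinine), by = patient_id]

# Small tolerance so that a rise of exactly 0.3 is not lost to floating-point rounding
dat[, flag := rise >= 0.3 - 1e-8]

# First flagged measurement per patient
first_hits <- dat[flag == TRUE, head(.SD, 1), by = patient_id]
print(first_hits)
```

Because the window is defined by timestamps rather than row counts, this should keep working when patients are measured at irregular intervals. The inner loop is quadratic per patient, which is usually fine for a handful of measurements each.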