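I am trying to write a custom function that generates new binary variables in an existing data frame. The idea is to be able to feed the function a diagnosis description (a string), ICD9 diagnosis codes (numbers), and a patient database. The function would then create a new variable for each diagnosis of interest and assign 0 or 1 depending on whether the patient (row, or observation) has that diagnosis.
Here are the function variables: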
x<-c("2851") #ICD9 for Anemia
y<-c("diag_1") #Primary diagnosis
z<-"Anemia" #Name of new binary variable for patient dataframe
i<-patient_db #patient dataframe
patient<-c("a","b","c")
diag_1<-c("8661", "2851","8651")
diag_2<-c("8651","8674","2866")
diag_3<-c("2430","3456","9089")
patient_db<-data_frame(patient,diag_1,diag_2,diag_3)
  patient diag_1 diag_2 diag_3
1 a       8661   8651   2430
2 b       2851   8674   3456
3 c       8651   2866   9089
Here is the function:
diagnosis_func<-function(x,y,z,i){
  pattern = paste("^(", paste0(x, collapse = "|"), ")", sep = "")
  i$z<-ifelse(rowSums(sapply(i[y], grepl, pattern = pattern)) != 0,"1","0")
}
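This is what I would like to get after running the function:
  patient diag_1 diag_2 diag_3 Anemia
1 a       8661   8651   2430   0
2 b       2851   8674   3456   1
3 c       8651   2866   9089   0
The lines inside the function have been tested outside the function and they work. Where I am stuck is getting the function itself to work. Any help would be greatly appreciated.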
Happy New Year, 阿尔比特
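Answer 0 (score: 1)
This will work if you intend to use only one diagnosis at a time. I took the liberty of renaming the arguments so they are easier to work with in the code.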
diagnosis_func <- function(data, target_col, icd, new_col){
  pattern <- sprintf("^(%s)",
                     paste0(icd, collapse = "|"))
  data[[new_col]] <- grepl(pattern = pattern,
                           x = data[[target_col]]) + 0L
  data
}
diagnosis_func(patient_db, "diag_1", "2851", "Anemia")
# Multiple codes for a single diagnosis
diagnosis_func(patient_db, "diag_1", c("8661", "8651"), "Dx")
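As far as I can tell, the original version failed for two reasons: `i$z` creates a column literally named `z` rather than one named by the value of `z`, and the function never returns the modified data frame. Below is a minimal sketch of both points in base R. The names `df`, `with_dollar`, `with_brackets`, and `add_flag` are made up for illustration and are not part of the question.

```r
# Toy data frame mirroring part of the question's patient_db (base R only)
df <- data.frame(patient = c("a", "b", "c"),
                 diag_1  = c("8661", "2851", "8651"),
                 stringsAsFactors = FALSE)

new_col <- "Anemia"

# `$` takes the name literally: this creates a column called "new_col"
with_dollar <- df
with_dollar$new_col <- 0L
names(with_dollar)

# `[[` evaluates the variable: this creates a column called "Anemia"
with_brackets <- df
with_brackets[[new_col]] <- 0L
names(with_brackets)

# Changing a data frame inside a function only changes the local copy,
# so the function must return it and the caller must keep the result
add_flag <- function(data, name) {
  data[[name]] <- 0L
  data
}
result <- add_flag(df, new_col)
names(df)      # the caller's df is unchanged
names(result)  # the returned copy has the new column
```

This is also why the calls above only print the result: to keep the new column you would assign it, for example `patient_db <- diagnosis_func(patient_db, "diag_1", "2851", "Anemia")`.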
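If you want to harden it a little against accidental errors, you can install the checkmate package and use this version. It checks the type of each argument before doing any work.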
diagnosis_func <- function(data, target_col, icd, new_col){
  coll <- checkmate::makeAssertCollection()
  checkmate::assert_class(x = data,
                          classes = "data.frame",
                          add = coll)
  checkmate::assert_character(x = target_col,
                              len = 1,
                              add = coll)
  checkmate::assert_character(x = icd,
                              add = coll)
  checkmate::assert_character(x = new_col,
                              len = 1,
                              add = coll)
  checkmate::reportAssertions(coll)
  pattern <- sprintf("^(%s)",
                     paste0(icd, collapse = "|"))
  data[[new_col]] <- grepl(pattern = pattern,
                           x = data[[target_col]]) + 0L
  data
}
diagnosis_func(patient_db, "diag_1", "2851", "Anemia")
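If you later need several diagnoses at once, or need to look in more than one diagnosis column, one possible extension is sketched below. This is an assumption about where you might want to take it, not something the question asked for in this form: `flag_diagnosis`, `dx_codes`, `diag_cols`, and `flagged` are hypothetical names, and I use a base `data.frame` so the example runs without extra packages.

```r
# The question's example data, rebuilt as a base data.frame
patient_db <- data.frame(
  patient = c("a", "b", "c"),
  diag_1  = c("8661", "2851", "8651"),
  diag_2  = c("8651", "8674", "2866"),
  diag_3  = c("2430", "3456", "9089"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: flag rows where ANY column in `target_cols`
# starts with one of the codes in `icd`
flag_diagnosis <- function(data, target_cols, icd, new_col) {
  pattern <- sprintf("^(%s)", paste0(icd, collapse = "|"))
  # One logical vector per column, then combine them row-wise with OR
  hits <- lapply(data[target_cols], function(col) grepl(pattern, col))
  data[[new_col]] <- Reduce(`|`, hits) + 0L
  data
}

# Hypothetical lookup table: new column name -> ICD9 codes
dx_codes <- list(Anemia = "2851",
                 Dx     = c("8661", "8651"))

# Add one 0/1 column per diagnosis, reassigning the result each time
diag_cols <- c("diag_1", "diag_2", "diag_3")
flagged <- patient_db
for (dx in names(dx_codes)) {
  flagged <- flag_diagnosis(flagged, diag_cols, dx_codes[[dx]], dx)
}
flagged
```

Keeping the codes in a named list means adding a diagnosis is a one-line change, and the loop reassigns `flagged` on every pass so each new column is kept.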