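Background
I have a data frame with a nested list column.
The list contains medical diagnoses, and I want to convert those diagnoses into codes.
For each list, one code is extracted into a primary diagnosis column, then the next diagnosis into a secondary diagnosis column, then a third. Each time a diagnosis matches something in the list, that element is removed from the list.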
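To make the extract-then-remove process concrete, here is a minimal base R sketch on a single character vector. The names `dx_patterns` and `extract_next` are made up for illustration, and it assumes the priority order is simply the order of the patterns (mitotic first, then Barrett's, then hiatus), with short codes standing in for the full labels.

```r
# Hypothetical lookup: regex pattern -> short code, listed in priority order
dx_patterns <- c(
  "mitotic|emr|tumour" = "C159",
  "barrett"            = "K227",
  "hiatus"             = "K449"
)

# Return the highest-priority code found in `dx`, plus the strings that did
# not match it, so the next call only sees the remaining diagnoses.
extract_next <- function(dx) {
  for (pattern in names(dx_patterns)) {
    hit <- grepl(pattern, dx, ignore.case = TRUE)
    if (any(hit)) {
      return(list(code = unname(dx_patterns[pattern]), remaining = dx[!hit]))
    }
  }
  # Nothing matched: no code, list unchanged
  list(code = NA_character_, remaining = dx)
}

dx <- c("Mitotic lesion", " Barretts", "Hiatus hernia",
        "THis was traversible", "but only just")

first  <- extract_next(dx)
second <- extract_next(first$remaining)
third  <- extract_next(second$remaining)

c(first$code, second$code, third$code)
# "C159" "K227" "K449"
```

Each call consumes the strings behind the code it returns, which is why the second call reports the Barrett's code rather than repeating the first one.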
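Problem
Every time I access the list I have to repeat the case_when. I would like to be able to reference it instead of writing it out each time.
Example: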
mydf <- structure(list(mydx = list("Normal Study", c("There is Barretts with two things",
"everywhere", " Four biopsies taken"
), c("Mitotic lesion", " Barretts",
"Hiatus hernia", "THis was traversible",
"but only just"), "Hiatus Hernia", c("Some Barrett's was seen",
"There were no nodules"), c("HGD seen",
"mitotic lesion"))), row.names = c(NA, -6L), class = c("tbl_df",
"tbl", "data.frame"))
My code
library(dplyr)
library(purrr)

mydf <- mydf %>%
  mutate(
    # Code every string in the list; within case_when the first matching pattern wins
    PrimaryDiagnosisCode = map(
      mydx, ~ case_when(
        grepl("mitotic|emr|tumour", .x, ignore.case = TRUE) ~ "C159 - Malignant neoplasm oesophagus, unspecified - Oesophagus - unspecified",
        grepl("barrett", .x, ignore.case = TRUE) ~ "K227 - Barrett's oesophagus",
        grepl("hiatus", .x, ignore.case = TRUE) ~ "K449 - Diaphragmatic hernia without obstruction or gangrene",
        TRUE ~ "other")
    ),
    # Remove the strings that matched a known diagnosis term
    mydx = map(
      mydx, ~ .x[!grepl("mitotic|emr|tumour|dysplasia|hiatus|stricture|barrett", .x, ignore.case = TRUE, perl = TRUE)]
    )
  )
I have to repeat this for SecondDiagnosis and ThirdDiagnosis (changing the name inside mutate).
This gives me the output I want:
PrimaryDiagnosis  SecondDiagnosis  ThirdDiagnosis
K227
C159              K227             K449
K449
K227
C159
What I'm looking for
mydf <- mydf %>%
  mutate(
    PrimaryDiagnosisCode = map(
      mydx, ~ case_when(REFERENCE TO MY GREPS)
    ),
    mydx = map(
      mydx, ~ .x[!grepl("mitotic|emr|tumour|dysplasia|hiatus|stricture|barrett", .x, ignore.case = TRUE, perl = TRUE)]
    )
  )
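One possible way to get this, sketched under assumptions: wrap the `case_when` in a small function and pass that function to `map()` by name. `code_dx` and `drop_coded` are hypothetical helper names, and the two-row tibble is only there to make the snippet self-contained. The behaviour is meant to be the same as the code above.

```r
library(dplyr)
library(purrr)

# Hypothetical helper: the case_when is written once and referenced by name
code_dx <- function(x) {
  case_when(
    grepl("mitotic|emr|tumour", x, ignore.case = TRUE) ~ "C159 - Malignant neoplasm oesophagus, unspecified - Oesophagus - unspecified",
    grepl("barrett", x, ignore.case = TRUE) ~ "K227 - Barrett's oesophagus",
    grepl("hiatus", x, ignore.case = TRUE) ~ "K449 - Diaphragmatic hernia without obstruction or gangrene",
    TRUE ~ "other"
  )
}

# Hypothetical helper: drop the strings that matched a known diagnosis term
drop_coded <- function(x) {
  x[!grepl("mitotic|emr|tumour|dysplasia|hiatus|stricture|barrett", x, ignore.case = TRUE)]
}

# Small stand-in for the real data
mydf <- tibble(
  mydx = list("Normal Study", c("Mitotic lesion", " Barretts", "Hiatus hernia"))
)

mydf <- mydf %>%
  mutate(
    PrimaryDiagnosisCode = map(mydx, code_dx),
    mydx = map(mydx, drop_coded)
  )
```

The same two `map()` lines could then be repeated for the second and third diagnosis columns without restating the patterns.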