I have a function in R that uses case_when:
myfunction <- function(df, col, case_name, cntl_name) {
  object <- df %>%
    mutate(
      class = case_when(
        col == case_name ~ 1,
        col == cntl_name ~ 0,
      )
    )
  return(object)
}
So, if I have this object:
df <- structure(
  list(
    id = c("ID1", "ID2", "ID3", "ID4", "ID5"),
    phenotype = c("blue", "blue", "red", "green", "red"),
    treatment = c("treat1", "treat2", "none", "none", "none"),
    weeks_of_treatment = c(0, 0, 0, 0, 0)
  ),
  row.names = c("ID1", "ID2", "ID3", "ID4", "ID5"),
  class = "data.frame"
)
> df
     id phenotype treatment weeks_of_treatment
ID1 ID1      blue    treat1                  0
ID2 ID2      blue    treat2                  0
ID3 ID3       red      none                  0
ID4 ID4     green      none                  0
ID5 ID5       red      none                  0
and then run:
newdf <- myfunction(df, "phenotype", "red", "blue")
it should return a data frame like this:
   id phenotype treatment weeks_of_treatment class
1 ID1      blue    treat1                  0     0
2 ID2      blue    treat2                  0     0
3 ID3       red      none                  0     1
4 ID4     green      none                  0    NA
5 ID5       red      none                  0     1
But it doesn't. It returns the following instead:
> newdf
   id phenotype treatment weeks_of_treatment class
1 ID1      blue    treat1                  0    NA
2 ID2      blue    treat2                  0    NA
3 ID3       red      none                  0    NA
4 ID4     green      none                  0    NA
5 ID5       red      none                  0    NA
It does not recognize the variable col as the column phenotype. Does anyone know how to pass a dynamic variable into case_when?
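I have tried other solutions for working with variables in dplyr (for example, wrapping col in double brackets, [[col]]), but I could not find one that works.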
Answer 0 (score: 1)
library(dplyr)

myfunction <- function(df, col, case_name, cntl_name) {
  object <- df %>%
    mutate(
      class = case_when(
        # {{ col }} injects the column that the caller passed unquoted
        {{ col }} == case_name ~ 1,
        {{ col }} == cntl_name ~ 0
      )
    )
  return(object)
}
myfunction(df, phenotype, "red", "blue")
   id phenotype treatment weeks_of_treatment class
1 ID1      blue    treat1                  0     0
2 ID2      blue    treat2                  0     0
3 ID3       red      none                  0     1
4 ID4     green      none                  0    NA
5 ID5       red      none                  0     1
Personally, I prefer
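myfunction <- function(df, col, case_name, cntl_name) {
  # Capture the unquoted column argument as a quosure
  qCol <- enquo(col)
  object <- df %>%
    mutate(
      class = case_when(
        # !! unquotes the captured column inside the data mask
        !!qCol == case_name ~ 1,
        !!qCol == cntl_name ~ 0
      )
    )
  return(object)
}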
because it makes the separation between environment variables and data frame variables explicit.
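As far as I know, rlang documents the embrace operator `{{ col }}` as shorthand for `!!enquo(col)`, so the two styles should give the same result. A minimal sketch to check this, assuming dplyr is attached; the helper names `flag_embrace` and `flag_enquo` and the small data frame are hypothetical and only for illustration:

```r
library(dplyr)

# Hypothetical helper using the embrace operator
flag_embrace <- function(df, col, value) {
  df %>% mutate(flag = {{ col }} == value)
}

# Hypothetical helper using enquo() followed by !!
flag_enquo <- function(df, col, value) {
  q_col <- enquo(col)
  df %>% mutate(flag = !!q_col == value)
}

df_small <- data.frame(phenotype = c("blue", "red", "green"))

a <- flag_embrace(df_small, phenotype, "red")
b <- flag_enquo(df_small, phenotype, "red")

# Both helpers should produce the same data frame
identical(a, b)
```

Which one to use is mostly a matter of taste: the embrace form is shorter, while the `enquo()` form names the captured column as its own object.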
When working with non-standard evaluation (NSE), the link in my comment is my go-to page.
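If you would rather keep the call from the question, where the column name is passed as a string ("phenotype"), one option is the `.data` pronoun. In the original function, `col == case_name` compares the string "phenotype" with "red", which is FALSE for every row, so no case matches and every row gets NA. A minimal sketch, assuming dplyr is attached; the name `myfunction_chr` is hypothetical and the data frame is trimmed to the two columns that matter here:

```r
library(dplyr)

# Hypothetical variant: `col` is a string holding the column name
myfunction_chr <- function(df, col, case_name, cntl_name) {
  df %>%
    mutate(
      class = case_when(
        # .data[[col]] looks up the column whose name is stored in `col`
        .data[[col]] == case_name ~ 1,
        .data[[col]] == cntl_name ~ 0
      )
    )
}

df_small <- data.frame(
  id = paste0("ID", 1:5),
  phenotype = c("blue", "blue", "red", "green", "red")
)

newdf <- myfunction_chr(df_small, "phenotype", "red", "blue")
newdf$class
# expected values: 0, 0, 1, NA, 1
```

The difference from the versions above is only in how the column is supplied: a quoted name here, an unquoted name there.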