I want to use case_when() in dplyr to create a new categorical column that shows a person's current status in a training course.
I have a tibble like this:
library(dplyr)
problem <- tibble(name = c("Angela", "Claire", "Justin"),
                  status_1 = c("Registered", "No Action", "Completed"),
                  status_2 = c("Withdrawn", "No Action", "Registered"),
                  status_3 = c("No Action", "Registered", "Withdrawn"))
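If the person has ever completed the course, their status should be Completed (even if they accidentally registered for the course again later, as Justin did in this example). If they have not completed the course, their status should be Registered, as long as no later status such as "No Action" or "Withdrawn" undoes it. Otherwise, for example if they never registered or if they withdrew after registering, their status should be Not Taken.
In this example, the final dataset should look like this: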
library(dplyr)
solution <- tibble(name = c("Angela", "Claire", "Justin"),
                   status_1 = c("Registered", "No Action", "Completed"),
                   status_2 = c("Withdrawn", "No Action", "Registered"),
                   status_3 = c("No Action", "Registered", "Withdrawn"),
                   current = c("Not Taken", "Registered", "Completed"))
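Justin is Completed because he completed the course at some point. Angela is Not Taken because she withdrew her registration. Claire is Registered because that is her latest status.
Here is what I have so far. It classifies Justin and Claire correctly but misclassifies Angela. I know why it classifies her incorrectly, but I don't know how to handle the Registered case, because that involves looking at the number at the end of each column name, and R (correctly) treats the variable name as a character string.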
library(dplyr)
library(purrr)
library(stringr)  # provides str_detect()
solution <- problem %>%
  mutate(current_status = pmap_chr(select(., contains("status")), ~
    case_when(
      any(str_detect(c(...), "(?i)Completed")) ~ "Completed",
      any(str_detect(c(...), "(?i)Registered")) ~ "Registered",
      any(str_detect(c(...), "(?i)No Action")) | any(str_detect(c(...), "(?i)Withdrawn")) ~ "Not Taken",
      TRUE ~ "NA")))
Thanks!
Answer 0 (score: 3)
Here is one approach using apply and case_when. apply iterates over the rows of problem one at a time and computes the result for each row from the case_when conditions.
problem %>%
  mutate(
    current =
      apply(select(., starts_with("status")), 1, function(x) {
        case_when(
          "Completed" %in% x ~ "Completed",
          which.max(x == "Registered") > which.max(x %in% c("No Action", "Withdrawn")) ~ "Registered",
          TRUE ~ "Not Taken"
        )
      })
  )
# A tibble: 3 x 5
name status_1 status_2 status_3 current
<chr> <chr> <chr> <chr> <chr>
1 Angela Registered Withdrawn No Action Not Taken
2 Claire No Action No Action Registered Registered
3 Justin Completed Registered Withdrawn Completed
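The Registered condition depends on how which.max() treats a logical vector: it returns the position of the first TRUE, and it returns 1 when the vector contains no TRUE at all. The short sketch below walks through Angela's row in base R; the variable names are made up for illustration and are not part of the answer above.

```r
# Angela's row from the example: she registered first, then withdrew.
angela <- c("Registered", "Withdrawn", "No Action")

# Position of the first "Registered" (here the first element).
first_registered <- which.max(angela == "Registered")

# Position of the first "No Action" or "Withdrawn" (here the second element).
first_undone <- which.max(angela %in% c("No Action", "Withdrawn"))

# The comparison is FALSE, so case_when() falls through to "Not Taken".
angela_is_registered <- first_registered > first_undone

# Caveat: when nothing matches, which.max() still returns 1.
no_match <- which.max(c(FALSE, FALSE, FALSE))
```

Because of that last caveat, a row that never contains "Registered" and a row whose first entry is "Registered" both give position 1, which is worth keeping in mind for data beyond the three example rows.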
Outside of the pipe, you can simply do:
problem$current <- select(problem, starts_with("status")) %>%
  apply(., 1, function(x) {
    case_when(
      "Completed" %in% x ~ "Completed",
      which.max(x == "Registered") > which.max(x %in% c("No Action", "Withdrawn")) ~ "Registered",
      TRUE ~ "Not Taken"
    )
  })
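The condition above compares first positions, which works for the three example rows. For other data it may differ from the rule in the question: for instance, a row where every status is "Registered" gives 1 > 1 and so ends up as "Not Taken". A possible variant, sketched here under the assumption that the status columns are in chronological order and that "Registered" should win only when it appears after the last "No Action" or "Withdrawn", compares last positions instead. The names classify_status and last_pos are hypothetical helpers, and the example uses a base data frame so that it runs without extra packages.

```r
# Hypothetical helper (not from the original answer): classify one row of
# statuses that are given in chronological order.
classify_status <- function(x) {
  # Position of the last TRUE, or 0 when there is none (NA is ignored).
  last_pos <- function(hit) {
    idx <- which(hit)
    if (length(idx) > 0) max(idx) else 0L
  }

  last_registered <- last_pos(x == "Registered")
  last_undone <- last_pos(x %in% c("No Action", "Withdrawn"))

  if ("Completed" %in% x) {
    "Completed"
  } else if (last_registered > last_undone) {
    "Registered"
  } else {
    "Not Taken"
  }
}

# The example data from the question as a base data frame.
problem_df <- data.frame(
  name = c("Angela", "Claire", "Justin"),
  status_1 = c("Registered", "No Action", "Completed"),
  status_2 = c("Withdrawn", "No Action", "Registered"),
  status_3 = c("No Action", "Registered", "Withdrawn"),
  stringsAsFactors = FALSE
)

# Apply the helper to every row of the status columns.
status_cols <- grep("^status", names(problem_df), value = TRUE)
problem_df$current <- unname(apply(problem_df[status_cols], 1, classify_status))
```

The same function could be passed to apply() inside the mutate() call in place of the anonymous function, if this reading of the rule matches what you need.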