Creating grouped categories using str_detect and dplyr

Time: 2019-06-26 02:46:25

Tags: r dplyr

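Question

I am trying to create custom groups for a category column (Type). I was able to do it with str_detect inside dplyr's mutate. However, is there a simpler way to do the grouping? The nested ifelse statements inside mutate feel clumsy and require a lot of typing.

Thanks!

Reproducible example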

data <- data.frame('Type' = c("Organ Failure", "Drowning", "Coronary Disease", "Accident"), "No" = c(3, 1, 2, 4))

              Type No
1    Organ Failure  3
2         Drowning  1
3 Coronary Disease  2
4         Accident  4

Expected output

              Type No  Grouped Type
1    Organ Failure  3 Health Issues
2         Drowning  1      Accident
3 Coronary Disease  2 Health Issues
4         Accident  4      Accident

Code used to produce the above output

data %>% mutate('Grouped Type' = ifelse(str_detect(data$Type, 'Organ|Coronary'), "Health Issues", 
                                        ifelse(str_detect(data$Type, 'Drown|Accident'), "Accident", 0))) 

2 answers:

Answer 0: (score: 1)

I am not sure this is any less typing, but you can try case_when, which is cleaner and easier to read.

library(tidyverse)
data %>%
   mutate(`Grouped Type` = case_when(
           str_detect(Type, 'Organ|Coronary') ~ "Health Issues",
           str_detect(Type, 'Drown|Accident') ~ "Accident", 
           TRUE ~ NA_character_))

#              Type No  Grouped Type
#1    Organ Failure  3 Health Issues
#2         Drowning  1      Accident
#3 Coronary Disease  2 Health Issues
#4         Accident  4      Accident

There is also no need to use $ inside mutate; refer to the column by its bare name.
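To illustrate why the bare column name is preferable, here is a minimal sketch. It assumes the same example data as in the question and adds a hypothetical `filter(No > 1)` step and a hypothetical `is_health` column, neither of which is part of the original post. Inside `mutate`, `Type` refers to the column of the data frame currently flowing through the pipe, while `data$Type` always refers to the original, unfiltered object.

```r
library(dplyr)
library(stringr)

data <- data.frame(Type = c("Organ Failure", "Drowning", "Coronary Disease", "Accident"),
                   No = c(3, 1, 2, 4))

# Bare column name: evaluated against the filtered rows, so lengths always agree
ok <- data %>%
  filter(No > 1) %>%
  mutate(is_health = str_detect(Type, "Organ|Coronary"))

# data$Type still has 4 elements while the filtered frame has 3 rows,
# so mutate() refuses the mismatched vector and raises an error
bad <- tryCatch(
  data %>%
    filter(No > 1) %>%
    mutate(is_health = str_detect(data$Type, "Organ|Coronary")),
  error = function(e) "error"
)
```

In the question's code the pipe has no earlier step, so `data$Type` happens to work there, but it would fail or silently misalign as soon as rows are filtered, reordered, or grouped upstream.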

Answer 1: (score: 1)

We can do this with fuzzyjoin instead of using multiple ifelse calls. Create a key/value dataset, then join it to the data with regex_left_join.

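library(dplyr)      # provides %>% and select()
library(fuzzyjoin)
# Key/value table: each regex pattern in Type maps to a group label
keydat <- data.frame(Type = c("Organ", "Coronary", "Drown", "Accident"), 
      Grouped_Type = c("Health Issues", "Health Issues", "Accident", "Accident"))
# A row of data is matched when its Type matches the regex in keydat's Type
regex_left_join(data, keydat, by = "Type") %>% 
        select(Type = Type.x, No, Grouped_Type)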
#              Type No  Grouped_Type
#1    Organ Failure  3 Health Issues
#2         Drowning  1      Accident
#3 Coronary Disease  2 Health Issues
#4         Accident  4      Accident
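
The same key/value idea can also be sketched without a join, which may help when the list of groups grows. This is only an illustration under stated assumptions: the named vector `patterns` and the helper `assign_group` are hypothetical names introduced here, not part of dplyr, stringr, or fuzzyjoin, and the first matching pattern wins.

```r
library(dplyr)
library(stringr)

data <- data.frame(Type = c("Organ Failure", "Drowning", "Coronary Disease", "Accident"),
                   No = c(3, 1, 2, 4))

# Hypothetical lookup: names are the group labels, values are regex patterns
patterns <- c("Health Issues" = "Organ|Coronary",
              "Accident"      = "Drown|Accident")

# Hypothetical helper: label each element with the first pattern it matches,
# leaving NA where nothing matches (or where the input itself is NA)
assign_group <- function(x, patterns) {
  out <- rep(NA_character_, length(x))
  for (label in names(patterns)) {
    hit <- is.na(out) & !is.na(x) & str_detect(x, patterns[[label]])
    out[hit] <- label
  }
  out
}

result <- data %>%
  mutate(`Grouped Type` = assign_group(Type, patterns))
```

Adding a new group then only requires one more entry in `patterns`, rather than another nested ifelse or case_when branch.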