How do I use chained ifelse and grepl?

Asked: 2017-09-27 08:17:16

Tags: r

I have a database of thoroughbred names, structured as follows:

HorseName <- c("Grey emperor", "Smokey grey", "Gaining greys", "chestnut", "Glowing Chestnuts", "Ruby red", "My fair lady", "Man of war")
Number <- 1:8
df <- data.frame(HorseName, Number)

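I now want to search each horse's name for occurrences of colours. Specifically, I want to select all instances of 'grey' and 'chestnut' and create a new column identifying these colours. Any other name can simply be labelled 'Other'. Unfortunately, the names are inconsistent: they include plurals and different letter cases. How would I go about this in R?

My expected output is: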

df$Type <- c("Grey", "Grey", "Grey", "Chestnut", "Chestnut", "Other", "Other", "Other")

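I am familiar with chained ifelse statements, but I am not sure how to handle plurals and case sensitivity!

2 Answers:

Answer 0: (score: 3)

If you are interested in another approach, here is a tidyverse alternative that gives the same result as @amrrs's answer.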

library(tidyverse)
library(stringr)

df %>% 
  mutate(Type = str_extract(str_to_lower(HorseName), "grey|chestnut")) %>%
  mutate(Type = str_to_title(if_else(is.na(Type), "other", Type)))
#>           HorseName Number     Type
#> 1      Grey emperor      1     Grey
#> 2       Smokey grey      2     Grey
#> 3     Gaining greys      3     Grey
#> 4          chestnut      4 Chestnut
#> 5 Glowing Chestnuts      5 Chestnut
#> 6          Ruby red      6    Other
#> 7      My fair lady      7    Other
#> 8        Man of war      8    Other
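
For readers who prefer not to load the tidyverse, the same extract-then-title-case idea can be sketched in base R. This is only an illustration under stated assumptions: the `colours` vector and the title-casing step are hypothetical helpers, not part of the answer above. `regexpr()` locates the first match per name and `regmatches()` pulls out the matched text.

```r
# Rebuild the example data from the question.
HorseName <- c("Grey emperor", "Smokey grey", "Gaining greys", "chestnut",
               "Glowing Chestnuts", "Ruby red", "My fair lady", "Man of war")
df <- data.frame(HorseName, Number = seq_along(HorseName))

# Hypothetical list of colours to look for; extend as needed.
colours <- c("grey", "chestnut")
pattern <- paste(colours, collapse = "|")   # "grey|chestnut"

# regexpr() returns the position of the first match, or -1 if there is none.
m <- regexpr(pattern, df$HorseName, ignore.case = TRUE)

# Start with "Other" everywhere, then fill in the matched colour text.
Type <- rep("Other", nrow(df))
Type[m > 0] <- regmatches(df$HorseName, m)

# Normalise case: first letter upper, the rest lower.
df$Type <- paste0(toupper(substring(Type, 1, 1)), tolower(substring(Type, 2)))
df
```

Because the pattern is a plain substring match, plurals such as "greys" and "Chestnuts" are picked up without any extra handling.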

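Answer 1: (score: 2)

Converting all of the input text in df$HorseName to lowercase before pattern matching with grepl (and using lowercase patterns) solves the problem. Because grepl matches substrings, the pattern 'grey' also matches plurals such as 'greys'.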

> df$Type <- ifelse(grepl('grey',tolower(df$HorseName)),'Grey',
+                   ifelse(grepl('chestnut',tolower(df$HorseName)),'Chestnut',
+                                'others'))
> df
          HorseName Number     Type
1      Grey emperor      1     Grey
2       Smokey grey      2     Grey
3     Gaining greys      3     Grey
4          chestnut      4 Chestnut
5 Glowing Chestnuts      5 Chestnut
6          Ruby red      6   others
7      My fair lady      7   others
8        Man of war      8   others
>
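
As a variation on the same chained ifelse idea, `grepl()` also accepts an `ignore.case = TRUE` argument, which avoids the explicit `tolower()` call. The sketch below assumes the data frame from the question and uses the label "Other" to match the expected output stated there.

```r
# Rebuild the example data from the question.
HorseName <- c("Grey emperor", "Smokey grey", "Gaining greys", "chestnut",
               "Glowing Chestnuts", "Ruby red", "My fair lady", "Man of war")
df <- data.frame(HorseName, Number = seq_along(HorseName))

# ignore.case = TRUE makes each match case-insensitive, so "Grey" and "grey"
# are treated alike; substring matching takes care of plurals.
df$Type <- ifelse(grepl("grey", df$HorseName, ignore.case = TRUE), "Grey",
           ifelse(grepl("chestnut", df$HorseName, ignore.case = TRUE), "Chestnut",
                  "Other"))
df
```

The order of the conditions matters: a name containing both colours would be labelled "Grey", since the first matching branch wins.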