Question

Here is my sample data.

index <- c(1,2,3,4,5,6,7,8,9,10)
a <- c('a','b','c',NA,'D','e',NA,'g','h','i')
data <- data.frame(index,a)

I want to create a new column, name, that keeps only 'a' and 'b'. All other values, such as 'c', 'd', 'e', ..., should be labeled 'others', while NA should stay NA.

data$name = ifelse(!grepl('(a|b)',data$a),'others',data$name)

I tried using the grepl function, but it does not seem to work on data with missing values.

Answer 1

In base R:

data$res <- as.character(data$a)
data$res[! data$a %in% c("a","b") & !is.na(data$a)] <- "Other"
data
#    index    a   res
# 1      1    a     a
# 2      2    b     b
# 3      3    c Other
# 4      4 <NA>  <NA>
# 5      5    D Other
# 6      6    e Other
# 7      7 <NA>  <NA>
# 8      8    g Other
# 9      9    h Other
# 10    10    i Other

Note that the new column here is of type character.
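To see why the grepl attempt in the question mislabels missing values, it may help to check what grepl returns for NA. The following is a minimal sketch, assuming the sample data from the question and a character column (stringsAsFactors = FALSE is set explicitly as an assumption). The anchored pattern '^(a|b)$' is also an assumption, since the question's unanchored '(a|b)' would match any string that merely contains a or b.

```r
# Sample data from the question (character column assumed)
index <- 1:10
a <- c('a', 'b', 'c', NA, 'D', 'e', NA, 'g', 'h', 'i')
data <- data.frame(index, a, stringsAsFactors = FALSE)

# grepl() returns FALSE, not NA, for a missing input,
# so !grepl(...) is TRUE and NA rows get labeled 'others'
na_match <- grepl('^(a|b)$', NA_character_)

# Guard NA explicitly before deciding between keeping the value and 'others'
data$name <- ifelse(is.na(data$a),
                    NA_character_,
                    ifelse(data$a %in% c('a', 'b'), data$a, 'others'))
```

This produces the same labeling as the subsetting approach above, only written as a single vectorized expression.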

Answer 2

Using dplyr and its recode function, you can do:

data %>% mutate(name=recode(a, a="a", b="b", .default="other"))
#    index    a  name
# 1      1    a     a
# 2      2    b     b
# 3      3    c other
# 4      4 <NA>  <NA>
# 5      5    D other
# 6      6    e other
# 7      7 <NA>  <NA>
# 8      8    g other
# 9      9    h other
# 10    10    i other

For more complex matching, use case_when instead.
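A minimal sketch of what a case_when version might look like, assuming the sample data from the question with a as a character column. The result name res and the explicit NA branch are illustrative choices, not part of the original answer.

```r
library(dplyr)

# Sample data from the question (character column assumed)
data <- data.frame(
  index = 1:10,
  a = c('a', 'b', 'c', NA, 'D', 'e', NA, 'g', 'h', 'i'),
  stringsAsFactors = FALSE
)

# Conditions are evaluated in order; the first one that is TRUE wins
res <- data %>%
  mutate(name = case_when(
    is.na(a)           ~ NA_character_,  # keep missing values missing
    a %in% c("a", "b") ~ a,              # keep 'a' and 'b' unchanged
    TRUE               ~ "other"         # everything else
  ))
```

Because each branch is an arbitrary logical condition, further rules (for example a case-insensitive match) could be added as extra lines without restructuring the call.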
