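I have a list of occupations that I want to match to broader occupation categories. To do this, I have a table containing a list of patterns and the matching occupation category. For example,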
REGEX SuperCategory
cleaner Cleaners
(police)|(paramedic)|(emergency) Emergency services
(manager)|(director)|(executive) Management
structure(list(REGEX = c("cleaner", "(police)|(paramedic)|(emergency)"
), SuperCategory = c("Cleaners", NA)), class = c("tbl_df", "tbl",
"data.frame"), row.names = c(NA, -2L), .Names = c("REGEX", "SuperCategory"
))
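I want to create a new column in another data frame that holds the SuperCategory element whose REGEX pattern matches the corresponding row.
What I am doing at the moment is a series of ifelse statements combined with grepl.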
df %>%
  mutate(SuperOccupation = ifelse(grepl(regex_decoder$REGEX[1], Occupation1, ignore.case = TRUE),
                                  regex_decoder$SuperCategory[1],
                           ifelse(grepl(regex_decoder$REGEX[2], Occupation1, ignore.case = TRUE),
                                  regex_decoder$SuperCategory[2],
                           ifelse(grepl(regex_decoder$REGEX[3], Occupation1, ignore.case = TRUE),
                                  regex_decoder$SuperCategory[3],
                                  NA_character_))))
But I have to keep nesting like this for the full length of my table. Is there a way to "join" by regex?
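For reference, here is a minimal base R sketch of the behaviour I am after, without the hand-written nesting. It is only an illustration under assumptions: the four `Occupation1` values are made up for the example, and the lookup table is typed out from the three rows shown at the top. Looping over the patterns in reverse order means an earlier row overwrites a later one, which should reproduce the "first matching pattern wins" priority of the nested `ifelse` calls.

```r
# Lookup table typed out from the three example rows in the question
regex_decoder <- data.frame(
  REGEX = c("cleaner",
            "(police)|(paramedic)|(emergency)",
            "(manager)|(director)|(executive)"),
  SuperCategory = c("Cleaners", "Emergency services", "Management"),
  stringsAsFactors = FALSE
)

# Hypothetical occupations, invented purely for illustration
df <- data.frame(
  Occupation1 = c("Office cleaner", "Police officer", "Sales Manager", "Farmer"),
  stringsAsFactors = FALSE
)

# Start with NA everywhere, then fill in matches pattern by pattern.
# Iterating in reverse lets earlier rows of regex_decoder take precedence,
# as they do in the nested ifelse version.
df$SuperOccupation <- NA_character_
for (i in rev(seq_len(nrow(regex_decoder)))) {
  hit <- grepl(regex_decoder$REGEX[i], df$Occupation1, ignore.case = TRUE)
  df$SuperOccupation[hit] <- regex_decoder$SuperCategory[i]
}

print(df)
```

This scales with the number of rows in the lookup table rather than with hand-written nesting, but it is still a loop rather than a true join, so I would still like to know whether something join-like exists.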