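I feel there should be an efficient way to mutate a new column using `dplyr` with `case_when` and `contains`, but I can't get it to work.
I understand that using `case_when` inside `mutate` is "somewhat experimental" (as noted in this post), but I'd be grateful for any suggestions.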
This does not work:
library(tidyverse)
set.seed(1234)
x <- c("Black", "Blue", "Green", "Red")
df <- data.frame(a = 1:20,
b = sample(x,20, replace=TRUE))
df <- df %>%
mutate(group = case_when(.$b(contains("Bl")) ~ "Group1",
case_when(.$b(contains("re", ignore.case=TRUE)) ~ "Group2")
)
Answer 0 (score: 8)
We can use `grepl`, which returns a logical vector that `case_when` can use as a condition:
df %>%
mutate(group = case_when(grepl("Bl", b) ~ "Group1",
grepl("re", b, ignore.case = TRUE) ~"Group2"))
# a b group
#1 1 Black Group1
#2 2 Green Group2
#3 3 Green Group2
#4 4 Green Group2
#5 5 Red Group2
#6 6 Green Group2
#7 7 Black Group1
#8 8 Black Group1
#9 9 Green Group2
#10 10 Green Group2
#11 1 Green Group2
#12 2 Green Group2
#13 3 Blue Group1
#14 4 Red Group2
#15 5 Blue Group1
#16 6 Red Group2
#17 7 Blue Group1
#18 8 Blue Group1
#19 9 Black Group1
#20 10 Black Group1
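The same idea can be written with `stringr::str_detect`, which some people find easier to read inside `case_when`. The following is only a minimal sketch on a small made-up data frame (the four rows are an assumption for illustration, not the sampled data from the question); `regex(..., ignore_case = TRUE)` plays the role of `ignore.case = TRUE` in `grepl`.

```r
library(dplyr)
library(stringr)

# Hypothetical toy data: one row per colour, not the question's sampled df
df <- data.frame(a = 1:4,
                 b = c("Black", "Blue", "Green", "Red"))

df <- df %>%
  mutate(group = case_when(
    # case-sensitive match on "Bl"
    str_detect(b, "Bl") ~ "Group1",
    # case-insensitive match on "re" (matches "Green" and "Red")
    str_detect(b, regex("re", ignore_case = TRUE)) ~ "Group2"
  ))
```

Note that `contains()` is a selection helper for picking columns by name, which is why it does not work as a row-wise condition here.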
Answer 1 (score: 0)
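I'd like to add an example that uses `str_detect` together with `paste0`, which can also help when grouping by a common category would otherwise need an awkward join. Suppose you are working with gapminder or some other country-level df.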
interest <- c("Austria", "Belgium", "Bulgaria", "Croatia", "Cyprus",
"Czech Republic", "Denmark", "Estonia", "Finland",
"France", "Germany", "Greece", "Hungary", "Ireland",
"Italy", "Latvia", "Lithuania", "Luxembourg","Malta",
"The Netherlands", "Poland","Portugal", "Romania",
"Slovakia", "Slovenia","Spain", "Sweden","United Kingdom")
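# Collapse the ISO2 codes into one regex alternation, e.g. "AT|BE|...".
# paste0() has no `sep` argument; `collapse = "|"` alone avoids a trailing "|",
# which would otherwise match every string.
EU <- paste0(countrycode::countryname(
  sourcevar = interest, destination = "iso2c"),
  collapse = "|")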
df %<>% mutate(Region = case_when(
  # The pattern is split with paste0() so that no line break ends up inside the regex
  str_detect(Country, paste0("AT|BE|BG|HR|CY|CZ|DK|EE|FI|FR|DE|GR|HU|IE|",
                             "IT|LV|LT|LU|MT|NL|PL|PT|RO|SK|SI|ES|SE|GB|UK|G8")) ~ "EU",
  TRUE ~ "Not EU"))
You'll need to load `library(magrittr)` for the compound assignment pipe `%<>%` to work; `df %<>% f()` is essentially shorthand for `df <- df %>% f()`.
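To make the pattern-building step concrete, here is a small sketch that does not depend on `countrycode`. The three codes and the `countries` data frame are made up for illustration; the point is only that `paste(..., collapse = "|")` yields a regex alternation that `str_detect` can use inside `case_when`.

```r
library(dplyr)
library(stringr)

# Hypothetical subset of ISO2 codes, standing in for the full EU list
eu_codes <- c("AT", "BE", "FR")

# Join the codes into a single alternation: "AT|BE|FR"
eu_pattern <- paste(eu_codes, collapse = "|")

# Hypothetical data frame with a Country column holding ISO2 codes
countries <- data.frame(Country = c("AT", "FR", "US", "BE", "JP"))

countries <- countries %>%
  mutate(Region = case_when(
    str_detect(Country, eu_pattern) ~ "EU",
    TRUE ~ "Not EU"   # fallback for everything that did not match
  ))
```

If the column holds exact codes rather than free text, `Country %in% eu_codes` is probably the simpler condition, since it cannot match a code that merely appears inside a longer string.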