Creating a dummy variable from a variable with multiple values in R

Time: 2021-07-21 10:18:08

Tags: r

I have the data frame below and want to add a new column that is 1 if the languages spoken include English, and 0 otherwise.

language_spoken
         Jap;Fre
         Jap;Fre
         Fre;Ch
         Eng
         Eng;Jap
         Hindi;Eng
               
         Eng;Spanish;Fre
         Spanish;Jap
         Spanish

Desired final data frame:

      language_spoken   Eng
         Jap;Fre         0
         Jap;Fre         0
         Fre;Ch          0
         Eng             1
         Eng;Jap         1
         Hindi;Eng       1
                         0
         Eng;Spanish;Fre 1
         Spanish;Jap     0
         Spanish         0

I tried the following, but it does not work:

     b <- data.frame(model.matrix(~.-1,data))
     b

Example dataset:

     data <- data.frame(language_spoken = c("Jap;Fre","Jap;Fre","Fre;Ch","Eng","Eng;Jap","Hindi;Eng","","Eng;Spanish;Fre","Spanish;Jap","Spanish"))

2 answers:

Answer 0 (score: 2)

In base R, this will work:

data$Eng <- as.integer(grepl("Eng", data$language_spoken))

data
#>    language_spoken Eng
#> 1          Jap;Fre   0
#> 2          Jap;Fre   0
#> 3           Fre;Ch   0
#> 4              Eng   1
#> 5          Eng;Jap   1
#> 6        Hindi;Eng   1
#> 7                    0
#> 8  Eng;Spanish;Fre   1
#> 9      Spanish;Jap   0
#> 10         Spanish   0
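
One caveat: `grepl("Eng", ...)` matches "Eng" anywhere in the string, so an entry such as "English" (hypothetical, not in the sample data) would also be flagged. If exact matching on the `;`-separated codes matters, a minimal sketch is to split first and then test membership. The helper name `has_lang` is made up for illustration:

```r
data <- data.frame(
  language_spoken = c("Jap;Fre", "Jap;Fre", "Fre;Ch", "Eng", "Eng;Jap",
                      "Hindi;Eng", "", "Eng;Spanish;Fre", "Spanish;Jap", "Spanish")
)

# Hypothetical helper: 1 if `lang` is exactly one of the ";"-separated codes, else 0
has_lang <- function(x, lang) {
  # as.character() guards against factor columns (the default before R 4.0)
  tokens <- strsplit(as.character(x), ";", fixed = TRUE)
  as.integer(vapply(tokens, function(tok) lang %in% tok, logical(1)))
}

data$Eng <- has_lang(data$language_spoken, "Eng")
data

# Substring matching and exact-code matching can disagree on other inputs
substring_hit <- grepl("Eng", "English;Fre")
exact_hit <- has_lang("English;Fre", "Eng")
```

On the sample data both approaches give the same column; they only differ when one code is a substring of another.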

And this would be a tidyverse approach:

library(dplyr)
library(stringr)

data %>%
  mutate(Eng = as.numeric(str_detect(language_spoken, "Eng")))

#>    language_spoken Eng
#> 1          Jap;Fre   0
#> 2          Jap;Fre   0
#> 3           Fre;Ch   0
#> 4              Eng   1
#> 5          Eng;Jap   1
#> 6        Hindi;Eng   1
#> 7                    0
#> 8  Eng;Spanish;Fre   1
#> 9      Spanish;Jap   0
#> 10         Spanish   0

Created on 2021-07-21 by the reprex package (v0.3.0)
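
If an indicator column is wanted for every language rather than only "Eng", the same idea extends. The following is a rough base R sketch, not something the question asked for; it assumes `;` is the only separator and that codes carry no surrounding whitespace:

```r
data <- data.frame(
  language_spoken = c("Jap;Fre", "Jap;Fre", "Fre;Ch", "Eng", "Eng;Jap",
                      "Hindi;Eng", "", "Eng;Spanish;Fre", "Spanish;Jap", "Spanish")
)

# Split each row into its language codes; the empty row yields character(0)
tokens <- strsplit(as.character(data$language_spoken), ";", fixed = TRUE)

# All distinct codes seen in the column
langs <- sort(unique(unlist(tokens)))

# One 0/1 column per language; rows line up with `data`
dummies <- vapply(
  langs,
  function(lang) as.integer(vapply(tokens, function(tok) lang %in% tok, logical(1))),
  integer(nrow(data))
)

result <- cbind(data, dummies)
result
```

This differs from `model.matrix(~ . - 1, data)`, which treats each whole string (for example "Jap;Fre") as a single factor level instead of splitting it into individual languages.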

Answer 1 (score: 1)

Does this work:

library(dplyr)
data %>% mutate(Eng = +grepl('Eng',language_spoken))
            language_spoken Eng
1                   Jap;Fre   0
2                   Jap;Fre   0
3                    Fre;Ch   0
4                       Eng   1
5                   Eng;Jap   1
6                 Hindi;Eng   1
7                             0
8           Eng;Spanish;Fre   1
9               Spanish;Jap   0
10                  Spanish   0
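
The unary `+` in this answer is a compact way to coerce the logical vector returned by `grepl()` to integer. A small base R sketch of the equivalence (variable names are illustrative only):

```r
language_spoken <- c("Jap;Fre", "Eng", "Hindi;Eng", "", "Spanish")

# grepl() returns TRUE/FALSE; unary plus turns that into 1/0 integers
flags_plus <- +grepl("Eng", language_spoken)

# The more explicit spelling of the same coercion
flags_explicit <- as.integer(grepl("Eng", language_spoken))

identical(flags_plus, flags_explicit)
```

`as.integer()` states the intent more directly, while `+` is shorter; the choice is a matter of style.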