Using grepl

Date: 2019-01-22 19:26:52

Tags: r

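I am working with data collected from a select-all-that-apply question. The responses were recorded in the order the user clicked them and are separated by commas. For this example I have included answers for only a few of the available options. I need to use R to create a dummy variable for each answer choice: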

Program <- c("Admitted, Early Admitted, Other", "Early Admitted, Tuition", "Other, Early Admitted", "Tuition", "Early Admitted")
df <- as.data.frame(Program)
> df
                          Program
1 Admitted, Early Admitted, Other
2         Early Admitted, Tuition
3           Other, Early Admitted
4                         Tuition
5                  Early Admitted

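I want to create a new 1/0 variable for each choice. I know not to use ifelse with ==, because that only works for exact matches, so I turned to the grepl function. However, I am running into a problem with the "Admitted" and "Early Admitted" values.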

df$EarlyAdmitted <- ifelse(grepl("Early Admitted", df$Program),1,0)
df$Admitted <- ifelse(grepl("Admitted", df$Program),1,0)
df$Tuition <- ifelse(grepl("Tuition", df$Program),1,0)
df$Other <- ifelse(grepl("Other", df$Program),1,0)



> df
                          Program EarlyAdmitted Admitted Tuition Other
1 Admitted, Early Admitted, Other             1        1       0     1
2         Early Admitted, Tuition             1        1       1     0
3           Other, Early Admitted             1        1       0     1
4                         Tuition             0        0       1     0
5                  Early Admitted             1        1       0     0
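
The Admitted column is 1 on every row that contains "Early Admitted" because grepl() tests whether the pattern occurs anywhere in the string, not whether it equals a whole comma-separated item. A minimal check, using a small made-up vector x for illustration:

```r
# grepl() looks for the pattern anywhere inside each string,
# so "Admitted" is also found inside "Early Admitted".
x <- c("Admitted", "Early Admitted", "Tuition")  # illustrative values only

substring_hits <- grepl("Admitted", x)
print(substring_hits)
```

The second element is TRUE even though that response never selected the plain "Admitted" option.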

"Admitted" and "Early Admitted" need to be treated as two separate, distinct strings. This is what I want the data to look like:

> df 
                           Program EarlyAdmitted Admitted Tuition Other
1 Admitted, Early Admitted, Other             1        1       0     1
2         Early Admitted, Tuition             1        0       1     0
3           Other, Early Admitted             1        0       0     1
4                         Tuition             0        0       1     0
5                  Early Admitted             1        0       0     0
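
One way this layout might be produced, offered as a sketch rather than a definitive answer, is to split each response on the commas and then test each option for exact membership with %in%. It assumes the five responses shown in the printed data frame and that the options themselves never contain a comma. The helper name has_choice is hypothetical.

```r
Program <- c("Admitted, Early Admitted, Other", "Early Admitted, Tuition",
             "Other, Early Admitted", "Tuition", "Early Admitted")
df <- data.frame(Program, stringsAsFactors = FALSE)

# Split every response on commas and trim the surrounding whitespace,
# giving one character vector of selected options per row.
choices <- lapply(strsplit(df$Program, ",", fixed = TRUE), trimws)

# Hypothetical helper: 1 if `option` is exactly one of the selected items, else 0.
has_choice <- function(option) {
  as.integer(vapply(choices, function(sel) option %in% sel, logical(1)))
}

df$EarlyAdmitted <- has_choice("Early Admitted")
df$Admitted      <- has_choice("Admitted")
df$Tuition       <- has_choice("Tuition")
df$Other         <- has_choice("Other")
print(df)
```

Because %in% compares whole items, "Admitted" no longer matches inside "Early Admitted".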

I tried the perl, fixed, and useBytes arguments of grepl, but none of them produced the data above.
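
Those arguments change how the pattern is interpreted, not where it is allowed to match, which is likely why they make no difference here. A possible way to stay with grepl is to anchor the option to the delimiters around it. The sketch below assumes the separator is always a comma followed by one space and that option names contain no regex metacharacters. The helper name matches_choice is hypothetical.

```r
Program <- c("Admitted, Early Admitted, Other", "Early Admitted, Tuition",
             "Other, Early Admitted", "Tuition", "Early Admitted")

# Hypothetical helper: the option must start at the beginning of the string
# or right after ", ", and must end at a comma or at the end of the string.
matches_choice <- function(option, x) {
  pattern <- paste0("(^|, )", option, "(,|$)")
  as.integer(grepl(pattern, x))
}

admitted       <- matches_choice("Admitted", Program)
early_admitted <- matches_choice("Early Admitted", Program)
print(admitted)
print(early_admitted)
```

With the anchors in place, "Admitted" preceded by "Early " is not counted, because the text in front of it is neither the start of the string nor ", ".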

I searched SO and found this answer: How to use grep()/gsub() to find exact match, but I could not get it to work with grepl. I also found this answer: Exact match with grepl R, but its solution uses == for an exact match.

Any ideas would be greatly appreciated!

0 answers:

No answers