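I am working with data collected from a "select all that apply" question. The responses are recorded in the order the user clicked them and are separated by commas. For this example, I have included only a few of the available answer options. I need to use R to create a dummy variable for each answer option: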
Program <- c("Admitted, Early Admitted, Other", "Early Admitted, Tuition", "Other, Early Admitted", "Tuition", "Early Admitted")
df <- as.data.frame(Program)
> df
                          Program
1 Admitted, Early Admitted, Other
2         Early Admitted, Tuition
3           Other, Early Admitted
4                         Tuition
5                  Early Admitted
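I want to create a new variable for each choice, coded 1/0. I know not to use ifelse with ==, because that only works for exact matches, so I turned to the grepl function. However, I have run into a problem with the "Admitted" and "Early Admitted" responses.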
df$EarlyAdmitted <- ifelse(grepl("Early Admitted", df$Program),1,0)
df$Admitted <- ifelse(grepl("Admitted", df$Program),1,0)
df$Tuition <- ifelse(grepl("Tuition", df$Program),1,0)
df$Other <- ifelse(grepl("Other", df$Program),1,0)
> df
                          Program EarlyAdmitted Admitted Tuition Other
1 Admitted, Early Admitted, Other             1        1       0     1
2         Early Admitted, Tuition             1        1       1     0
3           Other, Early Admitted             1        1       0     1
4                         Tuition             0        0       1     0
5                  Early Admitted             1        1       0     0
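The unwanted 1s in the Admitted column appear to come from grepl matching substrings: the pattern "Admitted" occurs inside "Early Admitted". A minimal check in base R, using two of the strings from the example (the variable names here are only illustrative), shows the effect:

```r
# grepl() reports whether the pattern occurs anywhere in the string,
# so "Admitted" is also found inside "Early Admitted".
x <- c("Early Admitted, Tuition", "Tuition")

# fixed = TRUE treats the pattern as a literal string, but it is still a substring search.
admitted_hit <- grepl("Admitted", x, fixed = TRUE)
print(admitted_hit)
# [1]  TRUE FALSE
```

This would also explain why fixed = TRUE alone does not separate the two choices.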
"Admitted" and "Early Admitted" need to be treated as two separate, unique strings. This is how I would like the data to look:
> df
                          Program EarlyAdmitted Admitted Tuition Other
1 Admitted, Early Admitted, Other             1        1       0     1
2         Early Admitted, Tuition             1        0       1     0
3           Other, Early Admitted             1        0       0     1
4                         Tuition             0        0       1     0
5                  Early Admitted             1        0       0     0
I tried the perl, fixed, and useBytes arguments of grepl, but none of them produced the data shown above.
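One direction that might work with grepl itself is to anchor the pattern to the delimiters, so that a choice only matches when it spans a whole comma-separated entry. The following is a sketch under assumptions, not a verified solution: has_choice is a hypothetical helper, the separator is assumed to be exactly a comma followed by one space, the choice names are assumed to contain no regex metacharacters, and the fifth response is assumed to be "Early Admitted" as in the printed data frame.

```r
# Example data (assumption: fifth response is "Early Admitted").
Program <- c("Admitted, Early Admitted, Other", "Early Admitted, Tuition",
             "Other, Early Admitted", "Tuition", "Early Admitted")

# has_choice() is a hypothetical helper, not a base R function.
# The pattern requires the choice to begin at the start of the string or
# right after ", ", and to end at a comma or at the end of the string.
has_choice <- function(x, choice) {
  pattern <- paste0("(^|, )", choice, "(,|$)")
  as.integer(grepl(pattern, x))
}

admitted <- has_choice(Program, "Admitted")
early_admitted <- has_choice(Program, "Early Admitted")
print(admitted)
print(early_admitted)
```

With this anchoring, "Admitted" should no longer match inside "Early Admitted", because the text before it there is "Early " rather than the start of the string or ", ".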
I searched SO and found this answer: How to use grep()/gsub() to find exact match, but I could not get it to work with grepl. I also found this answer: Exact match with grepl R, but its solution uses == for an exact match.
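Since == and %in% do exact matching, another possible route is to split each response on the commas first and then test each piece exactly. This is only a sketch under assumptions: the names choices and choices_by_row are illustrative, the separator is assumed to be a comma followed by optional spaces, and the fifth response is assumed to be "Early Admitted" as in the printed data frame.

```r
# Example data (assumption: fifth response is "Early Admitted").
Program <- c("Admitted, Early Admitted, Other", "Early Admitted, Tuition",
             "Other, Early Admitted", "Tuition", "Early Admitted")
df <- data.frame(Program, stringsAsFactors = FALSE)

# Split each response into its individual selections.
choices_by_row <- strsplit(df$Program, ", *")

# Answer options to turn into dummy variables (illustrative subset).
choices <- c("Early Admitted", "Admitted", "Tuition", "Other")

for (choice in choices) {
  # Column name is the choice with spaces removed, e.g. "EarlyAdmitted".
  col_name <- gsub(" ", "", choice, fixed = TRUE)
  # %in% compares whole strings, so "Admitted" does not match "Early Admitted".
  df[[col_name]] <- vapply(choices_by_row,
                           function(selected) as.integer(choice %in% selected),
                           integer(1))
}

print(df)
```

Because the comparison happens on whole entries after splitting, no regular expression is needed for the match itself.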
Any ideas would be greatly appreciated!