How do I convert multiple dummy variables into 1 factor variable?

Time: 2018-03-30 12:13:59

Tags: r dummy-variable

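I have a large dataset from a survey that contains many statements coded as dummy variables. Each dummy is a factor with the levels "quoted" and "not quoted". Because different groups of statements belong to the same topic, I would like to convert each group into 1 larger factor variable that has the dummy variables as levels, while the values stay "quoted" and "not quoted" (or 1 and 0, it doesn't matter at the moment).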

So right now, with 2 dummy variables, my data looks like this:

    pp_plan_thoughtAWhile   pp_plan_justHappen  
     not quoted                  not quoted 
     not quoted                  not quoted 
     not quoted                  not quoted 
     not quoted                  not quoted 
     not quoted                  quoted 
     quoted                      quoted 

I need it to look like this:

               #plan 
      ## value     thoughtAWhile    justHappen
           0           350             550  
           1           650             450

Does anyone know how to do this? Any help would be greatly appreciated, I'm struggling with this!

2 answers:

Answer 0 (score: 2)

We can use gather to reshape the dataset into "long" format, then get the frequencies with count and spread them into "wide" format.

library(tidyverse)
gather(df1) %>%
   count(key, value) %>%
   spread(key, n)
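
For anyone who wants to run this end to end, here is a minimal sketch. It assumes df1 looks like the six rows shown in the question (that data frame is rebuilt by hand here, so treat it as hypothetical), and it swaps in pivot_longer and pivot_wider, which tidyr now recommends in place of gather and spread. The names key, value and counts are just illustrative choices.

```r
library(dplyr)
library(tidyr)

# Hypothetical df1, rebuilt from the six example rows in the question
df1 <- data.frame(
  pp_plan_thoughtAWhile = factor(c("not quoted", "not quoted", "not quoted",
                                   "not quoted", "not quoted", "quoted")),
  pp_plan_justHappen = factor(c("not quoted", "not quoted", "not quoted",
                                "not quoted", "quoted", "quoted"))
)

counts <- df1 %>%
  # Long format: one row per (variable name, answer) pair
  pivot_longer(everything(), names_to = "key", values_to = "value") %>%
  # Frequency of each answer within each variable
  count(key, value) %>%
  # Wide format: one column per original dummy variable
  pivot_wider(names_from = key, values_from = n, values_fill = 0)

counts
```

With the six rows above, this should give one row per answer ("not quoted", "quoted") and one count column per dummy variable.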

Answer 1 (score: 0)

Here is one way to do it.

Data

   pp_plan_thoughtAWhile <- sample(c("Quoted", "NotQuoted"), 10, replace = TRUE, prob = c(0.7, 0.3))
   pp_plan_justHappen <- sample(c("Quoted", "NotQuoted"), 10, replace = TRUE, prob = c(0.5, 0.5))
   dv <- data.frame(pp_plan_justHappen, pp_plan_thoughtAWhile)
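
Note that sample() draws at random, so the counts will differ from run to run unless a seed is fixed first. A small sketch of that idea follows; the seed value 42 and the names first and second are arbitrary and not part of the original answer.

```r
# Fixing the seed makes the random draw repeatable (42 is an arbitrary choice)
set.seed(42)
first <- sample(c("Quoted", "NotQuoted"), 10, replace = TRUE, prob = c(0.7, 0.3))

# Resetting the same seed reproduces exactly the same draw
set.seed(42)
second <- sample(c("Quoted", "NotQuoted"), 10, replace = TRUE, prob = c(0.7, 0.3))

identical(first, second)
```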

Processing

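# Make sure both dummies are factors
dv$pp_plan_justHappen <- as.factor(dv$pp_plan_justHappen)
dv$pp_plan_thoughtAWhile <- as.factor(dv$pp_plan_thoughtAWhile)

library(reshape2)
# The lines below rely on mdata still containing both original columns
mdata <- melt(dv)

# Recode each dummy as 1 ("Quoted") or 0 ("NotQuoted")
mdata$bin_plan_justhappen <- ifelse(mdata$pp_plan_justHappen == "Quoted", 1, 0)
mdata$bin_plan_thoughtwhile <- ifelse(mdata$pp_plan_thoughtAWhile == "Quoted", 1, 0)
library(plyr)
# Cross-tabulate the two recoded dummies, first as a table, then as a frequency data frame
table(mdata$bin_plan_justhappen, mdata$bin_plan_thoughtwhile)
plyr::count(mdata, c("bin_plan_justhappen", "bin_plan_thoughtwhile"))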

Result

bin_plan_justhappen bin_plan_thoughtwhile freq
               0                     1    2
               1                     0    1
               1                     1    7
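
The frequencies above count combinations of the two dummies. If what you want is the layout from the question instead (one column per dummy, one row per value), a base R sketch along these lines might work. It assumes the data looks like the six example rows in the question; df1, binary and plan are hypothetical names used only for illustration.

```r
# Hypothetical data, rebuilt from the six example rows in the question
df1 <- data.frame(
  pp_plan_thoughtAWhile = factor(c("not quoted", "not quoted", "not quoted",
                                   "not quoted", "not quoted", "quoted")),
  pp_plan_justHappen = factor(c("not quoted", "not quoted", "not quoted",
                                "not quoted", "quoted", "quoted"))
)

# Recode every dummy to 1 ("quoted") or 0 ("not quoted")
binary <- as.data.frame(lapply(df1, function(x) as.integer(x == "quoted")))

# Tabulate each column against fixed levels 0 and 1,
# so a value that never occurs still shows up with a count of 0
plan <- sapply(binary, function(x) table(factor(x, levels = c(0, 1))))

# Drop the shared prefix so the columns read like levels of one "plan" topic
colnames(plan) <- sub("^pp_plan_", "", colnames(plan))

plan
```

The result is a plain matrix with rows "0" and "1" and one column per statement, which can be wrapped in as.data.frame() if a data frame is easier to work with.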