Collapsing a data frame by factors across two or more columns

Time: 2019-06-21 00:05:53

Tags: r tidyverse plyr

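I have looked through several posts but could not find a solution to my problem.

I have two factor columns (Quarter and Hours.of.Work) and want to collapse by both of them at once, but I am struggling to get fct_collapse to work across multiple columns and factors.

I tried: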

HW2 <- HW %>% 
        mutate(Hours.of.Work = 
                 fct_collapse(Hours.of.Work,
                              "0" = "0",
                              "1-19" = c("1-9", "10-19"),
                              "20-39" = c("20-29", "30-34", "35-39"),
                              "40" = "40",
                              "41-49" = c("41-44", "45-49"),
                              "50+" = "50+"),
               Hours.of.Work = factor(Hours.of.Work, 
                                      levels = c("0", "1-19", "20-39", "40", "41-49", "50+"))) 

but received the following error: Unknown levels in f : 1-9, 10-19, 20-29, 30-34, 35-39, 40, 41-44, 45-49

The data looks like this:

dput(head(HW,20))
structure(list(Quarter = structure(c(49L, 49L, 49L, 49L, 49L, 
49L, 49L, 49L, 49L, 49L, 48L, 48L, 48L, 48L, 48L, 48L, 48L, 48L, 
48L, 48L), .Label = c("2007Q1", "2007Q2", "2007Q3", "2007Q4", 
"2008Q1", "2008Q2", "2008Q3", "2008Q4", "2009Q1", "2009Q2", "2009Q3", 
"2009Q4", "2010Q1", "2010Q2", "2010Q3", "2010Q4", "2011Q1", "2011Q2", 
"2011Q3", "2011Q4", "2012Q1", "2012Q2", "2012Q3", "2012Q4", "2013Q1", 
"2013Q2", "2013Q3", "2013Q4", "2014Q1", "2014Q2", "2014Q3", "2014Q4", 
"2015Q1", "2015Q2", "2015Q3", "2015Q4", "2016Q1", "2016Q2", "2016Q3", 
"2016Q4", "2017Q1", "2017Q2", "2017Q3", "2017Q4", "2018Q1", "2018Q2", 
"2018Q3", "2018Q4", "2019Q1"), class = "factor"), Hours.of.Work = structure(c(1L, 
2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 1L, 2L, 3L, 4L, 5L, 6L, 
7L, 8L, 9L, 10L), .Label = c("0", "1-9 ", "10-19 ", "20-29 ", 
"30-34 ", "35-39 ", "40 ", "41-44 ", "45-49 ", "50+"), class = "factor"), 
    Population = c(144.7, 132.7, 184.5, 244.3, 232.4, 163.1, 
    631.3, 94.1, 218.9, 408.3, 157.4, 129.6, 186.2, 233.6, 190, 
    161.7, 698.9, 85.6, 215.7, 413.4)), row.names = c(NA, 20L
), class = "data.frame")

So, within each quarter (e.g. 2019Q1, 2018Q4, 2019Q3, etc.), I want to collapse the levels of Hours.of.Work from c("0", "1-9 ", "10-19 ", "20-29 ", "30-34 ", "35-39 ", "40 ", "41-44 ", "45-49 ", "50+") to c("0", "1-19", "20-39", "40", "41-49", "50+").

Thank you very much for any help!

Edit: updated code:
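The original level labels listed above carry trailing spaces ("1-9 ", "40 ", and so on), while the first attempt refers to them without the space, which is consistent with the Unknown levels message. A minimal sketch of one way around this, assuming it is acceptable to strip the whitespace from the labels first: `fct_relabel()` applies base R's `trimws()` to every level, after which `fct_collapse()` can use the clean names. The vector `hours` is a hypothetical stand-in for `HW$Hours.of.Work`.

```r
library(forcats)

# Hypothetical stand-in for HW$Hours.of.Work, with the same
# trailing-space labels that appear in the dput() output.
raw_levels <- c("0", "1-9 ", "10-19 ", "20-29 ", "30-34 ", "35-39 ",
                "40 ", "41-44 ", "45-49 ", "50+")
hours <- factor(raw_levels, levels = raw_levels)

# Strip leading/trailing whitespace from every level label.
hours_trimmed <- fct_relabel(hours, trimws)

# Collapse using the clean labels; levels that are not mentioned
# ("0", "40", "50+") are left unchanged.
hours_collapsed <- fct_collapse(
  hours_trimmed,
  "1-19"  = c("1-9", "10-19"),
  "20-39" = c("20-29", "30-34", "35-39"),
  "41-49" = c("41-44", "45-49")
)

levels(hours_collapsed)
```

Trimming once up front avoids having to remember the trailing space in every later reference to a level.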

HW2 <- HW %>% 
        mutate(Hours.of.Work = 
                 fct_collapse(Hours.of.Work,
                              "0" = "0",
                              "1-19" = c("1-9 ", "10-19 "),
                              "20-39" = c("20-29 ", "30-34 ", "35-39 "),
                              "40" = "40 ",
                              "41-49" = c("41-44 ", "45-49 "),
                              "50+" = "50+"),
               Hours.of.Work = factor(Hours.of.Work, 
                                      levels = c("0", "1-19", "20-39", "40", "41-49", "50+")))

I no longer get the Unknown levels error, but the code above does not aggregate my factors correctly.

Resulting table
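One possible reading of the problem: `fct_collapse()` only relabels factor levels, so after the `mutate()` the data frame still has one row per original hour band, and Population is not combined. A minimal sketch, assuming the goal is the sum of Population for each Quarter and collapsed Hours.of.Work level: group by both factor columns and summarise. `HW_demo` is a hypothetical stand-in built from the 20 rows shown in the `dput()` output, and `.groups = "drop"` assumes dplyr 1.0.0 or later.

```r
library(dplyr)
library(forcats)

# Hypothetical stand-in for HW, using the 20 rows from the dput() output.
raw_levels <- c("0", "1-9 ", "10-19 ", "20-29 ", "30-34 ", "35-39 ",
                "40 ", "41-44 ", "45-49 ", "50+")
HW_demo <- data.frame(
  Quarter = factor(rep(c("2019Q1", "2018Q4"), each = 10)),
  Hours.of.Work = factor(rep(raw_levels, times = 2), levels = raw_levels),
  Population = c(144.7, 132.7, 184.5, 244.3, 232.4, 163.1,
                 631.3, 94.1, 218.9, 408.3, 157.4, 129.6, 186.2, 233.6, 190,
                 161.7, 698.9, 85.6, 215.7, 413.4)
)

HW_sum <- HW_demo %>%
  # Relabel: the row count is unchanged at this point.
  mutate(Hours.of.Work = fct_collapse(
    Hours.of.Work,
    "1-19"  = c("1-9 ", "10-19 "),
    "20-39" = c("20-29 ", "30-34 ", "35-39 "),
    "40"    = "40 ",
    "41-49" = c("41-44 ", "45-49 ")
  )) %>%
  # Aggregate: one row per Quarter and collapsed level.
  group_by(Quarter, Hours.of.Work) %>%
  summarise(Population = sum(Population), .groups = "drop")

HW_sum
```

If a different aggregate is wanted (for example a mean rather than a total), only the expression inside `summarise()` needs to change.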

0 answers:

No answers yet