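I have looked through several posts but could not find a solution to my problem.
I have two factor columns (Quarter and Hours.of.Work) and want to collapse the levels of Hours.of.Work within each Quarter, but I am struggling to get fct_collapse to work across multiple columns and factors.
I tried: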
    library(dplyr)
    library(forcats)

    HW2 <- HW %>%
      mutate(Hours.of.Work =
               fct_collapse(Hours.of.Work,
                            "0" = "0",
                            "1-19" = c("1-9", "10-19"),
                            "20-39" = c("20-29", "30-34", "35-39"),
                            "40" = "40",
                            "41-49" = c("41-44", "45-49"),
                            "50+" = "50+"),
             Hours.of.Work = factor(Hours.of.Work,
                                    levels = c("0", "1-19", "20-39", "40", "41-49", "50+")))
but got the following error:

    Unknown levels in `f`: 1-9, 10-19, 20-29, 30-34, 35-39, 40, 41-44, 45-49
The data looks like this:
dput(head(HW,20))
structure(list(Quarter = structure(c(49L, 49L, 49L, 49L, 49L,
49L, 49L, 49L, 49L, 49L, 48L, 48L, 48L, 48L, 48L, 48L, 48L, 48L,
48L, 48L), .Label = c("2007Q1", "2007Q2", "2007Q3", "2007Q4",
"2008Q1", "2008Q2", "2008Q3", "2008Q4", "2009Q1", "2009Q2", "2009Q3",
"2009Q4", "2010Q1", "2010Q2", "2010Q3", "2010Q4", "2011Q1", "2011Q2",
"2011Q3", "2011Q4", "2012Q1", "2012Q2", "2012Q3", "2012Q4", "2013Q1",
"2013Q2", "2013Q3", "2013Q4", "2014Q1", "2014Q2", "2014Q3", "2014Q4",
"2015Q1", "2015Q2", "2015Q3", "2015Q4", "2016Q1", "2016Q2", "2016Q3",
"2016Q4", "2017Q1", "2017Q2", "2017Q3", "2017Q4", "2018Q1", "2018Q2",
"2018Q3", "2018Q4", "2019Q1"), class = "factor"), Hours.of.Work = structure(c(1L,
2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 1L, 2L, 3L, 4L, 5L, 6L,
7L, 8L, 9L, 10L), .Label = c("0", "1-9 ", "10-19 ", "20-29 ",
"30-34 ", "35-39 ", "40 ", "41-44 ", "45-49 ", "50+"), class = "factor"),
Population = c(144.7, 132.7, 184.5, 244.3, 232.4, 163.1,
631.3, 94.1, 218.9, 408.3, 157.4, 129.6, 186.2, 233.6, 190,
161.7, 698.9, 85.6, 215.7, 413.4)), row.names = c(NA, 20L
), class = "data.frame")
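The `.Label` vector in the `dput` output above shows that most levels of Hours.of.Work end in a space (`"1-9 "`, `"40 "`), so names such as `"1-9"` do not match any level. A minimal sketch of how that could be checked and cleaned up, assuming a small stand-in factor `hours` (hypothetical, not the real data) and base R's `trimws()`:

```r
# Hypothetical stand-in for HW$Hours.of.Work, with the same trailing spaces
hours <- factor(c("0", "1-9 ", "10-19 ", "40 ", "50+"),
                levels = c("0", "1-9 ", "10-19 ", "40 ", "50+"))

# Levels that carry leading or trailing whitespace
padded <- levels(hours)[levels(hours) != trimws(levels(hours))]
print(padded)

# Strip the whitespace from the level labels; the underlying codes are unchanged
levels(hours) <- trimws(levels(hours))
print(levels(hours))
```

With the labels trimmed, the level names can be written without the trailing spaces.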
So, within each quarter (e.g. 2019Q1, 2018Q4, 2019Q3, etc.), I would like to collapse the levels of Hours.of.Work from

    c("0", "1-9 ", "10-19 ", "20-29 ", "30-34 ", "35-39 ", "40 ", "41-44 ", "45-49 ", "50+")

to

    c("0", "1-19", "20-39", "40", "41-49", "50+")

Thank you very much for any help!
Edit: updated code, with the trailing spaces included in the original level names:
    HW2 <- HW %>%
      mutate(Hours.of.Work =
               fct_collapse(Hours.of.Work,
                            "0" = "0",
                            "1-19" = c("1-9 ", "10-19 "),
                            "20-39" = c("20-29 ", "30-34 ", "35-39 "),
                            "40" = "40 ",
                            "41-49" = c("41-44 ", "45-49 "),
                            "50+" = "50+"),
             Hours.of.Work = factor(Hours.of.Work,
                                    levels = c("0", "1-19", "20-39", "40", "41-49", "50+")))
I no longer get the Unknown levels error, but the code above leaves my factor levels aggregated incorrectly:
Resulting table
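One possible reading of the problem is that fct_collapse only relabels rows: each Quarter still has one row per original hour band, and Population is not summed. A minimal sketch under that assumption, using a made-up miniature data frame `HW_demo` (hypothetical, not the real data) and summing Population per Quarter and collapsed level with `group_by()` and `summarise()`:

```r
library(dplyr)
library(forcats)

# Hypothetical miniature of HW: two quarters, four hour bands with trailing spaces
HW_demo <- data.frame(
  Quarter = factor(rep(c("2019Q1", "2018Q4"), each = 4)),
  Hours.of.Work = factor(rep(c("0", "1-9 ", "10-19 ", "40 "), times = 2),
                         levels = c("0", "1-9 ", "10-19 ", "40 ")),
  Population = c(10, 20, 30, 40, 1, 2, 3, 4)
)

HW_sum <- HW_demo %>%
  mutate(
    # Trim the labels first so the collapse can use clean names
    Hours.of.Work = fct_relabel(Hours.of.Work, trimws),
    # Merge the two lower bands; levels not mentioned are left as they are
    Hours.of.Work = fct_collapse(Hours.of.Work, "1-19" = c("1-9", "10-19"))
  ) %>%
  # One row per Quarter and collapsed level, with Population summed
  group_by(Quarter, Hours.of.Work) %>%
  summarise(Population = sum(Population), .groups = "drop")

print(HW_sum)
```

The same pattern should extend to the full set of bands by adding the remaining groups to the fct_collapse call. Whether summing Population is the right aggregation depends on what the table is meant to show.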