My data frame looks like this:
dat <- data.frame(QuarterYear = c("Q4 2019", "Q4 2019", "Q4 2019",
"Q4 2019", "Q4 2019", "Q4 2019", "Q4 2019", "Q4 2019", "Q4 2019",
"Q4 2019", "Q4 2019", "Q4 2019", "Q1 2020", "Q1 2020", "Q1 2020",
"Q1 2020", "Q1 2020", "Q1 2020", "Q1 2020", "Q1 2020", "Q1 2020",
"Q1 2020", "Q1 2020", "Q1 2020", "Q2 2020", "Q2 2020", "Q2 2020",
"Q2 2020", "Q2 2020", "Q2 2020", "Q2 2020", "Q2 2020", "Q2 2020",
"Q2 2020", "Q2 2020", "Q2 2020", "Q3 2020", "Q3 2020", "Q3 2020",
"Q3 2020", "Q3 2020", "Q3 2020", "Q3 2020", "Q3 2020", "Q3 2020",
"Q3 2020", "Q3 2020", "Q3 2020"),
Grade = c("Grade 8", "Grade 8",
"Grade 8", "Grade 9", "Grade 9", "Grade 9", "Grade 10", "Grade 10",
"Grade 10", "Grade 11", "Grade 11", "Grade 11", "Grade 8", "Grade 8",
"Grade 8", "Grade 9", "Grade 9", "Grade 9", "Grade 10", "Grade 10",
"Grade 10", "Grade 11", "Grade 11", "Grade 11", "Grade 8", "Grade 8",
"Grade 8", "Grade 9", "Grade 9", "Grade 9", "Grade 10", "Grade 10",
"Grade 10", "Grade 11", "Grade 11", "Grade 11", "Grade 8", "Grade 8",
"Grade 8", "Grade 9", "Grade 9", "Grade 9", "Grade 10", "Grade 10",
"Grade 10", "Grade 11", "Grade 11", "Grade 11"),
Type = c("overallAverage",
"CT", "RT", "overallAverage", "CT", "RT", "overallAverage", "CT",
"RT", "overallAverage", "CT", "RT", "overallAverage", "CT", "RT",
"overallAverage", "CT", "RT", "overallAverage", "CT", "RT", "overallAverage",
"CT", "RT", "overallAverage", "CT", "RT", "overallAverage", "CT",
"RT", "overallAverage", "CT", "RT", "overallAverage", "CT", "RT",
"overallAverage", "CT", "RT", "overallAverage", "CT", "RT", "overallAverage",
"CT", "RT", "overallAverage", "CT", "RT"),
value = c(2.48, 2.21,
0.27, 3.48, 3.03, 0.45, 4.6, 4, 0.6, 2.8, 2.4, 0.4, 2.54, 2.28,
0.26, 3.45, 3, 0.45, 4.46, 3.88, 0.58, 3.56, 2.81, 0.75, 2.47,
2.14, 0.33, 2.96, 2.54, 0.41, 4.1, 3.69, 0.41, 3.44, 2.61, 0.83,
2, 1.81, 0.19, 2.54, 2.26, 0.28, 4.11, 3.68, 0.43, 2.67, 2.11,
0.56), stringsAsFactors = FALSE)
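I am trying to reshape this data frame into wide format, where the unique values of Type become the rows and the values are filled in based on QuarterYear and Grade.
Put simply, if the first row is overallAverage, the first 4 columns would represent Q4 2019-Grade 8 through Q3 2020-Grade 8. The next 4 columns would be for Q4 2019-Grade 9 through Q3 2020-Grade 9, and so on.
I tried using the reshape function: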
widerDat <- reshape(dat, direction = "wide",idvar = "Type",timevar = "value")
How can I combine QuarterYear and Grade to get the desired output?
Please help me find a suitable solution. Thanks in advance!
Answer 0: (score: 1)
You can paste the time variables together and use the result as a single time= variable, like so:
res <- reshape(transform(dat, time=paste(QuarterYear, Grade)),
               direction="wide", idvar="Type", timevar="time",
               drop=c("QuarterYear", "Grade"))
res
#             Type value.Q4 2019 Grade 8 value.Q4 2019 Grade 9
# 1 overallAverage                  2.48                  3.48
# 2             CT                  2.21                  3.03
# 3             RT                  0.27                  0.45
#   value.Q4 2019 Grade 10 value.Q4 2019 Grade 11 value.Q1 2020 Grade 8
# 1                    4.6                    2.8                  2.54
# 2                    4.0                    2.4                  2.28
# 3                    0.6                    0.4                  0.26
#   value.Q1 2020 Grade 9 value.Q1 2020 Grade 10 value.Q1 2020 Grade 11
# 1                  3.45                   4.46                   3.56
# 2                  3.00                   3.88                   2.81
# 3                  0.45                   0.58                   0.75
#   value.Q2 2020 Grade 8 value.Q2 2020 Grade 9 value.Q2 2020 Grade 10
# 1                  2.47                  2.96                   4.10
# 2                  2.14                  2.54                   3.69
# 3                  0.33                  0.41                   0.41
#   value.Q2 2020 Grade 11 value.Q3 2020 Grade 8 value.Q3 2020 Grade 9
# 1                   3.44                  2.00                  2.54
# 2                   2.61                  1.81                  2.26
# 3                   0.83                  0.19                  0.28
#   value.Q3 2020 Grade 10 value.Q3 2020 Grade 11
# 1                   4.11                   2.67
# 2                   3.68                   2.11
# 3                   0.43                   0.56
To sort the columns into the desired order, we can use substr.
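One possible way to do a substr-based column ordering is sketched below. It is not taken from the answer itself: the toy data frame, the helper names (toy, val_cols, grade, res_sorted) and the character position 21 are all assumptions for illustration. Position 21 only works if every column name has the fixed layout "value.Qn YYYY Grade N" produced by the reshape call.

```r
# Toy data with the same layout as dat; the values are made up.
# expand.grid varies Type fastest, then Grade, then QuarterYear,
# which matches the row order of dat.
toy <- expand.grid(Type = c("overallAverage", "CT", "RT"),
                   Grade = paste("Grade", 8:11),
                   QuarterYear = c("Q4 2019", "Q1 2020", "Q2 2020", "Q3 2020"),
                   stringsAsFactors = FALSE)
toy$value <- seq_len(nrow(toy))

# Same reshape call as in the answer, applied to the toy data.
res <- reshape(transform(toy, time = paste(QuarterYear, Grade)),
               direction = "wide", idvar = "Type", timevar = "time",
               drop = c("QuarterYear", "Grade"))

# Value columns are named like "value.Q4 2019 Grade 8".
val_cols <- setdiff(names(res), "Type")

# Assumption: the grade number starts at character 21 of each name.
grade <- as.integer(substr(val_cols, 21, nchar(val_cols)))

# Sort by grade; ties keep their original (chronological) quarter order.
res_sorted <- res[, c("Type", val_cols[order(grade, seq_along(grade))])]

head(names(res_sorted), 5)
# "Type" "value.Q4 2019 Grade 8" "value.Q1 2020 Grade 8"
# "value.Q2 2020 Grade 8" "value.Q3 2020 Grade 8"
```

If the column names do not share a fixed width, a pattern-based extraction such as sub(".*Grade ", "", val_cols) would be a more robust replacement for the substr call.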
Answer 1: (score: 1)
I think this will do it:
library(tidyverse)
wider_data <- dat %>% mutate(new_col = paste(Grade,QuarterYear, sep = " ")) %>%
select(Type, new_col, value) %>%
pivot_wider(names_from = new_col, values_from = value)
To rearrange the columns manually, use this:
wider_data <- wider_data %>% select(1,2,6,10,14,3,7,11,15,4,8,12,16,5,9,13,17)
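The hard-coded index vector follows a regular pattern: after Type in column 1, columns 2 to 17 are laid out quarter by quarter, with four grades in each quarter. As a rough sketch, the same vector can be generated instead of typed out. The names n_grades, n_quarters, m and col_order are illustrative, and the sketch assumes exactly 4 grades and 4 quarters in the order produced by pivot_wider above.

```r
n_grades <- 4    # Grade 8 to Grade 11
n_quarters <- 4  # Q4 2019 to Q3 2020

# Fill column positions 2..17 column-wise: one matrix column per quarter,
# one matrix row per grade.
m <- matrix(seq(2, length.out = n_grades * n_quarters), nrow = n_grades)

# Reading the matrix row by row gives all quarters for Grade 8,
# then all quarters for Grade 9, and so on. Column 1 (Type) stays first.
col_order <- c(1, as.vector(t(m)))
col_order
# 1 2 6 10 14 3 7 11 15 4 8 12 16 5 9 13 17
```

The result could then be passed to select, for example as wider_data %>% select(all_of(col_order)), which avoids retyping the positions if the number of grades or quarters changes.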