Reshape a long-format data frame to wide based on two columns

Time: 2020-10-10 08:01:34

Tags: r dataframe reshape

My data frame looks like this:

dat <- data.frame(QuarterYear = c("Q4 2019", "Q4 2019", "Q4 2019", 
                              "Q4 2019", "Q4 2019", "Q4 2019", "Q4 2019", "Q4 2019", "Q4 2019", 
                              "Q4 2019", "Q4 2019", "Q4 2019", "Q1 2020", "Q1 2020", "Q1 2020", 
                              "Q1 2020", "Q1 2020", "Q1 2020", "Q1 2020", "Q1 2020", "Q1 2020", 
                              "Q1 2020", "Q1 2020", "Q1 2020", "Q2 2020", "Q2 2020", "Q2 2020", 
                              "Q2 2020", "Q2 2020", "Q2 2020", "Q2 2020", "Q2 2020", "Q2 2020", 
                              "Q2 2020", "Q2 2020", "Q2 2020", "Q3 2020", "Q3 2020", "Q3 2020", 
                              "Q3 2020", "Q3 2020", "Q3 2020", "Q3 2020", "Q3 2020", "Q3 2020", 
                              "Q3 2020", "Q3 2020", "Q3 2020"), 
              Grade = c("Grade 8", "Grade 8", 
                        "Grade 8", "Grade 9", "Grade 9", "Grade 9", "Grade 10", "Grade 10", 
                        "Grade 10", "Grade 11", "Grade 11", "Grade 11", "Grade 8", "Grade 8", 
                        "Grade 8", "Grade 9", "Grade 9", "Grade 9", "Grade 10", "Grade 10", 
                        "Grade 10", "Grade 11", "Grade 11", "Grade 11", "Grade 8", "Grade 8", 
                        "Grade 8", "Grade 9", "Grade 9", "Grade 9", "Grade 10", "Grade 10", 
                        "Grade 10", "Grade 11", "Grade 11", "Grade 11", "Grade 8", "Grade 8", 
                        "Grade 8", "Grade 9", "Grade 9", "Grade 9", "Grade 10", "Grade 10", 
                        "Grade 10", "Grade 11", "Grade 11", "Grade 11"), 
              Type = c("overallAverage", 
                       "CT", "RT", "overallAverage", "CT", "RT", "overallAverage", "CT", 
                       "RT", "overallAverage", "CT", "RT", "overallAverage", "CT", "RT", 
                       "overallAverage", "CT", "RT", "overallAverage", "CT", "RT", "overallAverage", 
                       "CT", "RT", "overallAverage", "CT", "RT", "overallAverage", "CT", 
                       "RT", "overallAverage", "CT", "RT", "overallAverage", "CT", "RT", 
                       "overallAverage", "CT", "RT", "overallAverage", "CT", "RT", "overallAverage", 
                       "CT", "RT", "overallAverage", "CT", "RT"), 
              value = c(2.48, 2.21, 
                        0.27, 3.48, 3.03, 0.45, 4.6, 4, 0.6, 2.8, 2.4, 0.4, 2.54, 2.28, 
                        0.26, 3.45, 3, 0.45, 4.46, 3.88, 0.58, 3.56, 2.81, 0.75, 2.47, 
                        2.14, 0.33, 2.96, 2.54, 0.41, 4.1, 3.69, 0.41, 3.44, 2.61, 0.83, 
                        2, 1.81, 0.19, 2.54, 2.26, 0.28, 4.11, 3.68, 0.43, 2.67, 2.11, 
                        0.56), stringsAsFactors = FALSE)

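I am trying to reshape this data frame to wide format, where the unique values of Type become the rows and the values are filled in based on QuarterYear and Grade.

Put simply, if the first row is overallAverage, the first 4 columns would represent Q4 2019-Grade 8 through Q3 2020-Grade 8. The next 4 columns would be Q4 2019-Grade 9 through Q3 2020-Grade 9, and so on.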
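To make the target layout concrete, the 16 column labels in the intended order can be generated as below. This is only a sketch: the names quarters, grades and target_cols are introduced here for illustration, and the "-" separator simply mirrors the description above.

```r
quarters <- c("Q4 2019", "Q1 2020", "Q2 2020", "Q3 2020")
grades   <- c("Grade 8", "Grade 9", "Grade 10", "Grade 11")

# expand.grid() varies its first argument fastest, so the quarters
# cycle inside each grade, which is the order described above.
combos <- expand.grid(QuarterYear = quarters, Grade = grades,
                      stringsAsFactors = FALSE)
target_cols <- paste(combos$QuarterYear, combos$Grade, sep = "-")
target_cols
```

The first four labels are the four quarters for Grade 8, the next four are the four quarters for Grade 9, and so on.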

I tried using the reshape function:

widerDat <- reshape(dat, direction = "wide",idvar = "Type",timevar = "value")  

How can I combine QuarterYear and Grade to get the desired output?

Please help me find a suitable solution. Thanks in advance!

2 answers:

Answer 0: (score: 1)

You can paste the time variables together and use the result as a single time= variable, like so:

res <- reshape(transform(dat, time=paste(QuarterYear, Grade)), direction="wide",
               idvar="Type", timevar="time", drop=c("QuarterYear", "Grade"))
res
#             Type value.Q4 2019 Grade 8 value.Q4 2019 Grade 9
# 1 overallAverage                  2.48                  3.48
# 2             CT                  2.21                  3.03
# 3             RT                  0.27                  0.45
#   value.Q4 2019 Grade 10 value.Q4 2019 Grade 11 value.Q1 2020 Grade 8
# 1                    4.6                    2.8                  2.54
# 2                    4.0                    2.4                  2.28
# 3                    0.6                    0.4                  0.26
#   value.Q1 2020 Grade 9 value.Q1 2020 Grade 10 value.Q1 2020 Grade 11
# 1                  3.45                   4.46                   3.56
# 2                  3.00                   3.88                   2.81
# 3                  0.45                   0.58                   0.75
#   value.Q2 2020 Grade 8 value.Q2 2020 Grade 9 value.Q2 2020 Grade 10
# 1                  2.47                  2.96                   4.10
# 2                  2.14                  2.54                   3.69
# 3                  0.33                  0.41                   0.41
#   value.Q2 2020 Grade 11 value.Q3 2020 Grade 8 value.Q3 2020 Grade 9
# 1                   3.44                  2.00                  2.54
# 2                   2.61                  1.81                  2.26
# 3                   0.83                  0.19                  0.28
#   value.Q3 2020 Grade 10 value.Q3 2020 Grade 11
# 1                   4.11                   2.67
# 2                   3.68                   2.11
# 3                   0.43                   0.56

To sort the columns in the desired format, we can use substr.
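One possible way to do that ordering with substr is sketched below. It is not necessarily the code this answer originally used. It assumes the column names have the form value.<QuarterYear> <Grade>, so that the grade label starts at character 15, and the names grades, grade_of, ord and res_sorted are illustrative. The data frame is the one from the question, rebuilt compactly with rep() so the snippet runs on its own.

```r
quarters <- c("Q4 2019", "Q1 2020", "Q2 2020", "Q3 2020")
grades   <- c("Grade 8", "Grade 9", "Grade 10", "Grade 11")

# Same 48 rows as the question's dat, in the same order.
dat <- data.frame(
  QuarterYear = rep(quarters, each = 12),
  Grade       = rep(rep(grades, each = 3), times = 4),
  Type        = rep(c("overallAverage", "CT", "RT"), times = 16),
  value = c(2.48, 2.21, 0.27, 3.48, 3.03, 0.45, 4.6, 4, 0.6, 2.8, 2.4, 0.4,
            2.54, 2.28, 0.26, 3.45, 3, 0.45, 4.46, 3.88, 0.58, 3.56, 2.81, 0.75,
            2.47, 2.14, 0.33, 2.96, 2.54, 0.41, 4.1, 3.69, 0.41, 3.44, 2.61, 0.83,
            2, 1.81, 0.19, 2.54, 2.26, 0.28, 4.11, 3.68, 0.43, 2.67, 2.11, 0.56),
  stringsAsFactors = FALSE)

res <- reshape(transform(dat, time = paste(QuarterYear, Grade)),
               direction = "wide", idvar = "Type", timevar = "time",
               drop = c("QuarterYear", "Grade"))

# Names look like "value.Q4 2019 Grade 8": "value." is 6 characters and the
# quarter label is 7, so the grade label starts at position 15 (assumption).
nm       <- names(res)[-1]
grade_of <- substr(nm, 15, nchar(nm))

# Order by grade; the second key keeps the existing quarter order within a grade.
ord        <- order(match(grade_of, grades), seq_along(nm))
res_sorted <- res[c(1, ord + 1)]
names(res_sorted)
```

After this, the columns run Grade 8 for all four quarters, then Grade 9, and so on, with Type kept first.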

Answer 1: (score: 1)

I think this will do it:

library(tidyverse)

wider_data <- dat %>% mutate(new_col = paste(Grade,QuarterYear, sep = " ")) %>%
  select(Type, new_col, value) %>%
  pivot_wider(names_from = new_col, values_from = value)

To rearrange the columns manually, use this:

wider_data <- wider_data %>% select(1,2,6,10,14,3,7,11,15,4,8,12,16,5,9,13,17)
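The hard-coded positions follow a regular pattern, so they can also be generated rather than typed out. A sketch, assuming the 16 value columns come out of pivot_wider quarter by quarter, with the four grades in the same order inside each quarter (as they do for the data above); n_grades, n_quarters, pos and idx are illustrative names.

```r
n_grades   <- 4
n_quarters <- 4

# Positions 2..17 laid out as a grades x quarters matrix (filled column by
# column, i.e. one column of the matrix per quarter).
pos <- matrix(seq_len(n_grades * n_quarters) + 1, nrow = n_grades)

# Reading the matrix row by row gives each grade across all quarters;
# position 1 (Type) stays first.
idx <- c(1, as.vector(t(pos)))
idx
# [1]  1  2  6 10 14  3  7 11 15  4  8 12 16  5  9 13 17
```

The resulting idx can be passed to select(all_of(idx)) in place of the literal positions.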