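I have a data frame with 3 columns (category name, month, and the total units sold).
I want to reshape it so that each category name, with its unit totals, is a row, and each column is one of the 12 months in an order I specify (starting with Oct and ending with Sep).
How can I do this? My current df is structured as follows: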
`Category Name` Month sum
<fct> <fct> <dbl>
1 Diet Soda Oct 34680
2 Diet Soda Nov 41589
3 Diet Soda Dec 31564
4 Diet Soda Jan 22635
5 Diet Soda Feb 34853
6 Diet Soda Mar 48583
7 Diet Soda Apr 33550
8 Diet Soda May 44991
9 Diet Soda Jun 34995
10 Diet Soda Jul 33260
11 Diet Soda Aug 46027
12 Diet Soda Sep 33924
13 Diet Soda Can Oct 0
14 Diet Soda Can Nov 1
15 Diet Soda Can Dec 0
16 Diet Soda Can Jan 0
17 Diet Soda Can Feb 0
18 Diet Soda Can Mar 0
19 Diet Soda Can Apr 0
20 Diet Soda Can May 0
Answer 0 (score: 1)
One option is pivot_wider, after creating a sequence column by group:
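library(dplyr)
library(tidyr)
df1 %>%
  # number the rows within each Month / Category Name combination
  group_by(Month, `Category Name`) %>%
  mutate(rn = row_number()) %>%
  # drop the grouping before reshaping
  ungroup() %>%
  # one column per month; rn stays as an identifier column
  pivot_wider(names_from = Month, values_from = sum)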
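Note: the group_by/mutate step is not really needed for this data, but it may be in the general case. On its own, pivot_wider reshapes the data from "long" format to "wide" format: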
df1 %>%
pivot_wider(names_from = Month, values_from = sum)
# A tibble: 2 x 13
# `Category Name` Oct Nov Dec Jan Feb Mar Apr May Jun Jul Aug Sep
# <chr> <int> <int> <int> <int> <int> <int> <int> <int> <int> <int> <int> <int>
#1 Diet Soda 34680 41589 31564 22635 34853 48583 33550 44991 34995 33260 46027 33924
#2 Diet Soda Can 0 1 0 0 0 0 0 0 NA NA NA NA
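The columns above come out as Oct through Sep only because the months first appear in that order in the data. If the rows were sorted differently, one way to enforce the order is to reorder the columns after pivoting. The following is a minimal sketch under that assumption; the `sales` data frame, its values, and the `month_order` vector are made up for illustration and are not part of the original answer.

```r
library(dplyr)
library(tidyr)

# Hypothetical data with the months deliberately out of order
sales <- data.frame(
  category = c("A", "A", "A", "B", "B"),
  Month    = c("Jan", "Oct", "Sep", "Oct", "Jan"),
  sum      = c(3L, 1L, 12L, 5L, 7L),
  stringsAsFactors = FALSE
)

# Desired column order: October first, September last
month_order <- c(month.abb[10:12], month.abb[1:9])

wide <- sales %>%
  pivot_wider(names_from = Month, values_from = sum) %>%
  # any_of() follows the order of month_order and skips months
  # that never occur in the data
  select(category, any_of(month_order))

print(wide)
```

Using any_of() rather than all_of() keeps the sketch from failing when some months are missing entirely, as they would be in a partial year.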
df1 <- structure(list(`Category Name` = c("Diet Soda", "Diet Soda",
"Diet Soda", "Diet Soda", "Diet Soda", "Diet Soda", "Diet Soda",
"Diet Soda", "Diet Soda", "Diet Soda", "Diet Soda", "Diet Soda",
"Diet Soda Can", "Diet Soda Can", "Diet Soda Can", "Diet Soda Can",
"Diet Soda Can", "Diet Soda Can", "Diet Soda Can", "Diet Soda Can"
), Month = c("Oct", "Nov", "Dec", "Jan", "Feb", "Mar", "Apr",
"May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec", "Jan",
"Feb", "Mar", "Apr", "May"), sum = c(34680L, 41589L, 31564L,
22635L, 34853L, 48583L, 33550L, 44991L, 34995L, 33260L, 46027L,
33924L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L)), class = "data.frame",
row.names = c("1",
"2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13",
"14", "15", "16", "17", "18", "19", "20"))
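The answer does not spell out which general case calls for the group_by/mutate step. One plausible reading, offered here as an assumption, is data where the same category and month pair occurs more than once. A small sketch with a hypothetical `dup` data frame shows how the row-number column then keeps each duplicate on its own row:

```r
library(dplyr)
library(tidyr)

# Hypothetical data: category "A" has two rows for "Oct"
dup <- data.frame(
  category = c("A", "A", "A"),
  Month    = c("Oct", "Oct", "Nov"),
  sum      = c(1L, 2L, 3L),
  stringsAsFactors = FALSE
)

wide_dup <- dup %>%
  group_by(Month, category) %>%
  # rn is 1 for the first row of each pair, 2 for the second, ...
  mutate(rn = row_number()) %>%
  ungroup() %>%
  # category and rn together identify a row, so no cell
  # receives more than one value
  pivot_wider(names_from = Month, values_from = sum)

print(wide_dup)
```

Without the rn column, pivot_wider would find two values for the same cell, warn about it, and return list-columns instead of plain numbers.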