有关在python / pandas中重塑数据的快速问题
有时我必须在excel中使用的报告的组织方式如图1所示(水平和垂直)
Make Model Volume Yr. 1 Volume Yr. 2
Gadget 1 Model 1 1254 1549
Gadget 2 Model 2 897 1108
Gadget 3 Model 3 1598 1974
Gadget 4 Model 4 5897 7283
Gadget 5 Model 5 9008 11125
Gadget 6 Model 6 2456 3033
Gadget 7 Model 7 700 865
Gadget 8 Model 8 367 453
我认为最好以表格形式处理信息,如下面的图2所示;
Make Model Product Type Specification Date Volume
Gadget 1 Model 1 Product Type 1 Specification 1 Volume Yr. 1 1254
Gadget 1 Model 1 Product Type 1 Specification 1 Volume Yr. 2 1549
Gadget 1 Model 1 Product Type 1 Specification 1 Volume Yr. 3 1913
Gadget 1 Model 1 Product Type 1 Specification 1 Volume Yr. 4 2362
Gadget 1 Model 1 Product Type 1 Specification 1 Volume Yr. 5 2917
Gadget 2 Model 2 Product Type 2 Specification 2 Volume Yr. 1 897
Gadget 2 Model 2 Product Type 2 Specification 2 Volume Yr. 2 1108
Gadget 2 Model 2 Product Type 2 Specification 2 Volume Yr. 3 1368
Gadget 2 Model 2 Product Type 2 Specification 2 Volume Yr. 4 1690
Gadget 2 Model 2 Product Type 2 Specification 2 Volume Yr. 5 2087
Gadget 3 Model 3 Product Type 3 Specification 3 Volume Yr. 1 1598
Gadget 3 Model 3 Product Type 3 Specification 3 Volume Yr. 2 1974
Gadget 3 Model 3 Product Type 3 Specification 3 Volume Yr. 3 2437
Gadget 3 Model 3 Product Type 3 Specification 3 Volume Yr. 4 3010
Gadget 3 Model 3 Product Type 3 Specification 3 Volume Yr. 5 3717
Gadget 4 Model 4 Product Type 4 Specification 4 Volume Yr. 1 5897
Gadget 4 Model 4 Product Type 4 Specification 4 Volume Yr. 2 7283
Gadget 4 Model 4 Product Type 4 Specification 4 Volume Yr. 3 8994
Gadget 4 Model 4 Product Type 4 Specification 4 Volume Yr. 4 11108
Gadget 4 Model 4 Product Type 4 Specification 4 Volume Yr. 5 13718
Gadget 5 Model 5 Product Type 5 Specification 5 Volume Yr. 1 9008
Gadget 5 Model 5 Product Type 5 Specification 5 Volume Yr. 2 11125
Gadget 5 Model 5 Product Type 5 Specification 5 Volume Yr. 3 13739
Gadget 5 Model 5 Product Type 5 Specification 5 Volume Yr. 4 16968
Gadget 5 Model 5 Product Type 5 Specification 5 Volume Yr. 5 20955
Gadget 6 Model 6 Product Type 6 Specification 6 Volume Yr. 1 2456
Gadget 6 Model 6 Product Type 6 Specification 6 Volume Yr. 2 3033
Gadget 6 Model 6 Product Type 6 Specification 6 Volume Yr. 3 3746
Gadget 6 Model 6 Product Type 6 Specification 6 Volume Yr. 4 4626
Gadget 6 Model 6 Product Type 6 Specification 6 Volume Yr. 5 5713
Gadget 7 Model 7 Product Type 7 Specification 7 Volume Yr. 1 700
Gadget 7 Model 7 Product Type 7 Specification 7 Volume Yr. 2 865
Gadget 7 Model 7 Product Type 7 Specification 7 Volume Yr. 3 1068
Gadget 7 Model 7 Product Type 7 Specification 7 Volume Yr. 4 1319
Gadget 7 Model 7 Product Type 7 Specification 7 Volume Yr. 5 1628
Gadget 8 Model 8 Product Type 8 Specification 8 Volume Yr. 1 367
Gadget 8 Model 8 Product Type 8 Specification 8 Volume Yr. 2 453
Gadget 8 Model 8 Product Type 8 Specification 8 Volume Yr. 3 560
Gadget 8 Model 8 Product Type 8 Specification 8 Volume Yr. 4 691
Gadget 8 Model 8 Product Type 8 Specification 8 Volume Yr. 5 854
您能否建议以最佳方式在pandas / python中获取无组织的水平和垂直数据表格?
非常感谢。
答案 0 :(得分:1)
考虑数据如下:
Make Model Volume Yr. 1 Volume Yr. 2
0 Gadget 1 Model 1 1254 1549
1 Gadget 2 Model 2 897 1108
2 Gadget 3 Model 3 1598 1974
3 Gadget 4 Model 4 5897 7283
4 Gadget 5 Model 5 9008 11125
5 Gadget 6 Model 6 2456 3033
6 Gadget 7 Model 7 700 865
7 Gadget 8 Model 8 367 453
df_new=df.melt(id_vars=['Make','Model'],var_name='Date',value_name='Value')
print(df_new)
Make Model Date Value
0 Gadget 1 Model 1 Volume Yr. 1 1254
1 Gadget 2 Model 2 Volume Yr. 1 897
2 Gadget 3 Model 3 Volume Yr. 1 1598
3 Gadget 4 Model 4 Volume Yr. 1 5897
4 Gadget 5 Model 5 Volume Yr. 1 9008
5 Gadget 6 Model 6 Volume Yr. 1 2456
6 Gadget 7 Model 7 Volume Yr. 1 700
7 Gadget 8 Model 8 Volume Yr. 1 367
8 Gadget 1 Model 1 Volume Yr. 2 1549
9 Gadget 2 Model 2 Volume Yr. 2 1108
10 Gadget 3 Model 3 Volume Yr. 2 1974
11 Gadget 4 Model 4 Volume Yr. 2 7283
12 Gadget 5 Model 5 Volume Yr. 2 11125
13 Gadget 6 Model 6 Volume Yr. 2 3033
14 Gadget 7 Model 7 Volume Yr. 2 865
15 Gadget 8 Model 8 Volume Yr. 2 453
类似地,您可以使用所有不需要在id_vars
下展平的列的所有列表,例如:
df.melt(id_vars=['Make','Model','Product Type','Specification'],\
var_name='Date',value_name='Value')