How to convert cross-tabulated (horizontal) data into tabular data in Pandas / Python

Time: 2019-02-18 14:49:54

Tags: python pandas

A quick question about reshaping data in python / pandas.

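Sometimes the reports I have to work with in Excel are laid out as shown in Figure 1, with identifiers running down the rows and one volume column per year running across: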

Make        Model      Volume Yr. 1    Volume Yr. 2
Gadget 1    Model 1    1254            1549
Gadget 2    Model 2    897             1108
Gadget 3    Model 3    1598            1974
Gadget 4    Model 4    5897            7283
Gadget 5    Model 5    9008            11125
Gadget 6    Model 6    2456            3033
Gadget 7    Model 7    700             865
Gadget 8    Model 8    367             453

I think it is better to work with the information in tabular (long) form, as shown in Figure 2 below:

Make        Model   Product Type    Specification   Date            Volume
Gadget 1    Model 1 Product Type 1  Specification 1 Volume Yr. 1    1254
Gadget 1    Model 1 Product Type 1  Specification 1 Volume Yr. 2    1549
Gadget 1    Model 1 Product Type 1  Specification 1 Volume Yr. 3    1913
Gadget 1    Model 1 Product Type 1  Specification 1 Volume Yr. 4    2362
Gadget 1    Model 1 Product Type 1  Specification 1 Volume Yr. 5    2917
Gadget 2    Model 2 Product Type 2  Specification 2 Volume Yr. 1    897
Gadget 2    Model 2 Product Type 2  Specification 2 Volume Yr. 2    1108
Gadget 2    Model 2 Product Type 2  Specification 2 Volume Yr. 3    1368
Gadget 2    Model 2 Product Type 2  Specification 2 Volume Yr. 4    1690
Gadget 2    Model 2 Product Type 2  Specification 2 Volume Yr. 5    2087
Gadget 3    Model 3 Product Type 3  Specification 3 Volume Yr. 1    1598
Gadget 3    Model 3 Product Type 3  Specification 3 Volume Yr. 2    1974
Gadget 3    Model 3 Product Type 3  Specification 3 Volume Yr. 3    2437
Gadget 3    Model 3 Product Type 3  Specification 3 Volume Yr. 4    3010
Gadget 3    Model 3 Product Type 3  Specification 3 Volume Yr. 5    3717
Gadget 4    Model 4 Product Type 4  Specification 4 Volume Yr. 1    5897
Gadget 4    Model 4 Product Type 4  Specification 4 Volume Yr. 2    7283
Gadget 4    Model 4 Product Type 4  Specification 4 Volume Yr. 3    8994
Gadget 4    Model 4 Product Type 4  Specification 4 Volume Yr. 4    11108
Gadget 4    Model 4 Product Type 4  Specification 4 Volume Yr. 5    13718
Gadget 5    Model 5 Product Type 5  Specification 5 Volume Yr. 1    9008
Gadget 5    Model 5 Product Type 5  Specification 5 Volume Yr. 2    11125
Gadget 5    Model 5 Product Type 5  Specification 5 Volume Yr. 3    13739
Gadget 5    Model 5 Product Type 5  Specification 5 Volume Yr. 4    16968
Gadget 5    Model 5 Product Type 5  Specification 5 Volume Yr. 5    20955
Gadget 6    Model 6 Product Type 6  Specification 6 Volume Yr. 1    2456
Gadget 6    Model 6 Product Type 6  Specification 6 Volume Yr. 2    3033
Gadget 6    Model 6 Product Type 6  Specification 6 Volume Yr. 3    3746
Gadget 6    Model 6 Product Type 6  Specification 6 Volume Yr. 4    4626
Gadget 6    Model 6 Product Type 6  Specification 6 Volume Yr. 5    5713
Gadget 7    Model 7 Product Type 7  Specification 7 Volume Yr. 1    700
Gadget 7    Model 7 Product Type 7  Specification 7 Volume Yr. 2    865
Gadget 7    Model 7 Product Type 7  Specification 7 Volume Yr. 3    1068
Gadget 7    Model 7 Product Type 7  Specification 7 Volume Yr. 4    1319
Gadget 7    Model 7 Product Type 7  Specification 7 Volume Yr. 5    1628
Gadget 8    Model 8 Product Type 8  Specification 8 Volume Yr. 1    367
Gadget 8    Model 8 Product Type 8  Specification 8 Volume Yr. 2    453
Gadget 8    Model 8 Product Type 8  Specification 8 Volume Yr. 3    560
Gadget 8    Model 8 Product Type 8  Specification 8 Volume Yr. 4    691
Gadget 8    Model 8 Product Type 8  Specification 8 Volume Yr. 5    854


Could you suggest the best way in pandas / python to turn data organized horizontally and vertically like this into tabular form?

Many thanks.

1 Answer:

Answer 0: (score: 1)

Consider the data to be as follows:

       Make    Model  Volume  Yr. 1  Volume Yr. 2
0  Gadget 1  Model 1           1254          1549
1  Gadget 2  Model 2            897          1108
2  Gadget 3  Model 3           1598          1974
3  Gadget 4  Model 4           5897          7283
4  Gadget 5  Model 5           9008         11125
5  Gadget 6  Model 6           2456          3033
6  Gadget 7  Model 7            700           865
7  Gadget 8  Model 8            367           453

Use pd.melt() (called here as the DataFrame method df.melt()):

df_new=df.melt(id_vars=['Make','Model'],var_name='Date',value_name='Value')
print(df_new)

        Make    Model           Date  Value
0   Gadget 1  Model 1  Volume  Yr. 1   1254
1   Gadget 2  Model 2  Volume  Yr. 1    897
2   Gadget 3  Model 3  Volume  Yr. 1   1598
3   Gadget 4  Model 4  Volume  Yr. 1   5897
4   Gadget 5  Model 5  Volume  Yr. 1   9008
5   Gadget 6  Model 6  Volume  Yr. 1   2456
6   Gadget 7  Model 7  Volume  Yr. 1    700
7   Gadget 8  Model 8  Volume  Yr. 1    367
8   Gadget 1  Model 1   Volume Yr. 2   1549
9   Gadget 2  Model 2   Volume Yr. 2   1108
10  Gadget 3  Model 3   Volume Yr. 2   1974
11  Gadget 4  Model 4   Volume Yr. 2   7283
12  Gadget 5  Model 5   Volume Yr. 2  11125
13  Gadget 6  Model 6   Volume Yr. 2   3033
14  Gadget 7  Model 7   Volume Yr. 2    865
15  Gadget 8  Model 8   Volume Yr. 2    453
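For anyone who wants to reproduce this, here is a minimal, self-contained sketch of the same melt call. The DataFrame is typed in by hand from the first three rows of the question's sample (an assumption, since the real data would come from Excel), and value_name is set to 'Volume' rather than 'Value' to match the column name in Figure 2. The extra sort is only there to group the rows by Make as in Figure 2; it is a plain string sort, so labels such as "Gadget 10" would sort before "Gadget 2".

```python
import pandas as pd

# Hand-typed sample: the first three rows of the question's Figure 1.
df = pd.DataFrame({
    "Make": ["Gadget 1", "Gadget 2", "Gadget 3"],
    "Model": ["Model 1", "Model 2", "Model 3"],
    "Volume Yr. 1": [1254, 897, 1598],
    "Volume Yr. 2": [1549, 1108, 1974],
})

# Unpivot every column that is not listed in id_vars into (Date, Volume) pairs,
# then order the rows by Make and Date so each gadget's years sit together.
long_df = (
    df.melt(id_vars=["Make", "Model"], var_name="Date", value_name="Volume")
      .sort_values(["Make", "Date"])
      .reset_index(drop=True)
)

print(long_df)
```

Without the sort_values step, melt returns all the "Volume Yr. 1" rows first and then all the "Volume Yr. 2" rows, as in the output above.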

Similarly, you can pass in id_vars the list of all the columns that you do not want to be unpivoted, for example:

# The names in id_vars must match the DataFrame's column labels exactly.
df.melt(id_vars=['Make', 'Model', 'Product Type', 'Specification'],
        var_name='Date', value_name='Value')
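When there are many identifier columns, typing them all out can get tedious. The sketch below is one possible variation, not part of the original answer: it assumes every year column starts with the prefix "Volume" and treats everything else as an identifier. The sample frame, the id_cols helper list and the extra Year column are illustrative assumptions.

```python
import pandas as pd

# Hand-typed sample with the extra identifier columns from Figure 2.
df = pd.DataFrame({
    "Make": ["Gadget 1", "Gadget 2"],
    "Model": ["Model 1", "Model 2"],
    "Product Type": ["Product Type 1", "Product Type 2"],
    "Specification": ["Specification 1", "Specification 2"],
    "Volume Yr. 1": [1254, 897],
    "Volume Yr. 2": [1549, 1108],
})

# Assumption: any column not starting with "Volume" is an identifier.
id_cols = [c for c in df.columns if not c.startswith("Volume")]

long_df = df.melt(id_vars=id_cols, var_name="Date", value_name="Volume")

# Optional: pull the trailing year number out of labels like "Volume Yr. 2".
long_df["Year"] = long_df["Date"].str.extract(r"(\d+)$", expand=False).astype(int)

print(long_df)
```

Having a numeric Year column makes later sorting and filtering easier than working with the original text labels.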