Pivot/Unstack a Pandas DataFrame in Python

Time: 2018-08-22 13:54:17

Tags: python pandas pivot

I have the following dataframe:

                            01/01/2017             02/01/2017
 Productid   ProductName    Sales     Discount     Sales     Discount
 1           abc            100       12           234       23
 2           xyz            156       13           237       13
 3           pqr            300       12           198       18

I need to convert this into the following dataframe.

 Productid   ProductName    Date          Sales      Discount
 1           abc            01/01/2017    100        12
 1           abc            02/01/2017    234        23
 2           xyz            01/01/2017    156        13
 2           xyz            02/01/2017    237        13
 3           pqr            01/01/2017    300        12
 3           pqr            02/01/2017    198        18

How can I do this in Python?

1 answer:

Answer 0 (score: 1)

A MultiIndex is hard to copy directly from the post, so first initialize a dataframe that matches the OP's original one.

import pandas as pd

df = pd.read_clipboard()  # read the flat header row and data rows of OP's dataframe
df
    Productid   ProductName Sales   Discount    Sales.1 Discount.1
0           1           abc   100         12        234         23
1           2           xyz   156         13        237         13
2           3           pqr   300         12        198         18

df.columns = ['Productid', 'ProductName', 'Sales', 'Discount', 'Sales', 'Discount']
df.set_index(keys=['Productid','ProductName'],inplace=True)
df
                         Sales  Discount    Sales   Discount
Productid   ProductName             
        1           abc    100        12      234         23
        2           xyz    156        13      237         13
        3           pqr    300        12      198         18

array = [['01/01/2017','01/01/2017','02/01/2017','02/01/2017'],
         ['Sales', 'Discount', 'Sales',  'Discount']]
df.columns = pd.MultiIndex.from_arrays(array) #setting multi-index

Assuming this is the OP's dataframe:

df
                         01/01/2017         02/01/2017
                         Sales  Discount    Sales   Discount
Productid   ProductName             
        1           abc    100        12      234         23
        2           xyz    156        13      237         13
        3           pqr    300        12      198         18
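
If the clipboard is not available, the same frame can probably be built directly in code. This is a minimal sketch, assuming the values shown in the question; the tuple keys of the dict become the two column levels, and the index construction is my own choice rather than part of the original answer.

```python
import pandas as pd

# Row index: (Productid, ProductName) pairs taken from the question.
index = pd.MultiIndex.from_tuples(
    [(1, "abc"), (2, "xyz"), (3, "pqr")],
    names=["Productid", "ProductName"],
)

# Tuple keys produce two-level columns: (date, measure).
df = pd.DataFrame(
    {
        ("01/01/2017", "Sales"): [100, 156, 300],
        ("01/01/2017", "Discount"): [12, 13, 12],
        ("02/01/2017", "Sales"): [234, 237, 198],
        ("02/01/2017", "Discount"): [23, 13, 18],
    },
    index=index,
)

print(df)
```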

The solution uses stack with the level=0 argument, then reset_index() with level=[0,1], then reset_index() once more. Finally, rename changes the name of the resulting index column to Date.

df = df.stack(level=0).reset_index(level=[0,1]).reset_index()
df.rename(columns={'index':'Date'},inplace=True)
df[['Productid', 'ProductName','Date','Sales','Discount']]

    Productid   ProductName       Date  Sales   Discount
0           1           abc 01/01/2017    100         12
1           1           abc 02/01/2017    234         23
2           2           xyz 01/01/2017    156         13
3           2           xyz 02/01/2017    237         13
4           3           pqr 01/01/2017    300         12
5           3           pqr 02/01/2017    198         18
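
As a variation on the same idea, the date level of the columns can be given a name up front, so that a single reset_index() yields a Date column and no rename is needed. This is a self-contained sketch under that assumption; the variable names wide and long are illustrative only, and the data are the values from the question. Recent pandas versions may emit a FutureWarning about the stack implementation, which should not affect this result.

```python
import pandas as pd

# Two-level columns; the first level is named "Date", the second is unnamed.
columns = pd.MultiIndex.from_arrays(
    [
        ["01/01/2017", "01/01/2017", "02/01/2017", "02/01/2017"],
        ["Sales", "Discount", "Sales", "Discount"],
    ],
    names=["Date", None],
)
index = pd.MultiIndex.from_arrays(
    [[1, 2, 3], ["abc", "xyz", "pqr"]],
    names=["Productid", "ProductName"],
)
wide = pd.DataFrame(
    [[100, 12, 234, 23], [156, 13, 237, 13], [300, 12, 198, 18]],
    index=index,
    columns=columns,
)

# Move the "Date" column level into the row index, then flatten the index.
long = wide.stack(level="Date").reset_index()

# Column order after stack can differ between pandas versions, so select explicitly.
long = long[["Productid", "ProductName", "Date", "Sales", "Discount"]]
print(long)
```

Selecting the columns explicitly at the end keeps the output layout stable regardless of how stack orders the remaining column level.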