Pandas pivot table and unpivot table

Time: 2019-08-19 14:34:25

Tags: python pandas dataframe

Given the DataFrame df:

df = pd.DataFrame({'Store_ID': [1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1],
               'Week_ID':  [1,1,1,1,1,1,1, 2,2,2,2,2,2,2, 3,3,3,3,3,3,3],
               'Day': ['Mo','Tu','We','Th','Fr','Sa','Su','Mo','Tu','We','Th','Fr','Sa','Su','Mo','Tu','We','Th','Fr','Sa','Su'],
               'Manager': ['Kev','Kev','Nash','Kev','Kev','Nash','Kev','Kev','Nash','Kev','Kev','Nash','Kev','Kev','Nash','Kev','Kev','Nash','Kev','Kev','Nash'],
               'Store_Opener': ['Jev','Jev','Oash','Kev','Kev','Nash','Jev','Jev','Oash','Kev','Kev','Nash','Jev','Jev','Oash','Kev','Kev','Nash','Kev','Kev','Nash']
           })

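I would like to pivot it into something like df1. (I would also like to know whether the operation can be reversed, that is, unpivoted back to df.)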

df1 = pd.DataFrame({'Store_ID': [1,1,1],
                   'Week_ID':  [1,2,3],
                   'Day_Mo_Manager':['Kev','Kev','Nash'],
                   'Day_Tu_Manager':['Kev','Nash','Kev'],
                   'Day_We_Manager':['?','?','?'],
                   'Day_Th_Manager':['?','?','?'],
                   'Day_Fr_Manager':['?','?','?'],
                   'Day_Sa_Manager':['?','?','?'],
                   'Day_Su_Manager':['?','?','?'],                       
                   'Day_Mo_Store_Opener':['Jev','Jev','Oash'],
                   'Day_Tu_Store_Opener':['Jev','Oash','Jev'],
                   'Day_We_Store_Opener':['?','?','?'],
                   'Day_Th_Store_Opener':['?','?','?'],
                   'Day_Fr_Store_Opener':['?','?','?'],
                   'Day_Sa_Store_Opener':['?','?','?'],
                   'Day_Su_Store_Opener':['?','?','?'],

})

Is there a way to pivot and unpivot the table as shown above? Inspired by "Partial Pivoting In Pandas SQL Or Spark", I tried

df.set_index(['Store_ID','Week_ID'])['Manager'].unstack()

df.pivot_table(index='Store_ID',columns='Week_ID',values='Manager')

but both attempts gave errors.
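A note on the first attempt, as a minimal sketch on a cut-down frame (the four rows below are made up for illustration and are not the df from the question): each (Store_ID, Week_ID) pair occurs once per day, so indexing on those two columns alone leaves duplicate index entries, and unstack() cannot reshape them. Adding Day to the index makes every entry unique.

```python
import pandas as pd

# Hypothetical cut-down frame: one store, two weeks, two days per week.
df = pd.DataFrame({
    'Store_ID': [1, 1, 1, 1],
    'Week_ID': [1, 1, 2, 2],
    'Day': ['Mo', 'Tu', 'Mo', 'Tu'],
    'Manager': ['Kev', 'Kev', 'Kev', 'Nash'],
})

# (Store_ID, Week_ID) repeats for every day, so this index has duplicates.
has_duplicates = df.set_index(['Store_ID', 'Week_ID']).index.has_duplicates

# unstack() needs unique index entries and raises ValueError otherwise.
try:
    df.set_index(['Store_ID', 'Week_ID'])['Manager'].unstack()
    unstack_error = None
except ValueError as exc:
    unstack_error = exc

# With Day included, every index entry is unique and unstack() can work.
unique_with_day = df.set_index(['Store_ID', 'Week_ID', 'Day']).index.is_unique

print(has_duplicates, type(unstack_error).__name__, unique_with_day)
```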

1 answer:

Answer 0 (score: 5)

You can try the following:

df_out = df.set_index(['Store_ID','Week_ID','Day']).unstack(-1)

df_out.columns = [f'Day_{j}_{i}' for i, j in df_out.columns]

df_out

Output:

                 Day_Fr_Manager Day_Mo_Manager Day_Sa_Manager Day_Su_Manager  \
Store_ID Week_ID                                                               
1        1                  Kev            Kev           Nash            Kev   
         2                 Nash            Kev            Kev            Kev   
         3                  Kev           Nash            Kev           Nash   

                 Day_Th_Manager Day_Tu_Manager Day_We_Manager  \
Store_ID Week_ID                                                
1        1                  Kev            Kev           Nash   
         2                  Kev           Nash            Kev   
         3                 Nash            Kev            Kev   

                 Day_Fr_Store_Opener Day_Mo_Store_Opener Day_Sa_Store_Opener  \
Store_ID Week_ID                                                               
1        1                       Kev                 Jev                Nash   
         2                      Nash                 Jev                 Jev   
         3                       Kev                Oash                 Kev   

                 Day_Su_Store_Opener Day_Th_Store_Opener Day_Tu_Store_Opener  \
Store_ID Week_ID                                                               
1        1                       Jev                 Kev                 Jev   
         2                       Jev                 Kev                Oash   
         3                      Nash                Nash                 Kev   

                 Day_We_Store_Opener  
Store_ID Week_ID                      
1        1                      Oash  
         2                       Kev  
         3                       Kev  
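As an alternative sketch, the same wide table can probably be built with DataFrame.pivot, assuming a pandas version where pivot accepts a list for index (1.1.0 or later, as far as I know). The small frame below is made up for illustration; the column flattening is the same comprehension as above.

```python
import pandas as pd

# Hypothetical cut-down frame: one store, two weeks, three days per week.
df = pd.DataFrame({
    'Store_ID': [1] * 6,
    'Week_ID': [1, 1, 1, 2, 2, 2],
    'Day': ['Mo', 'Tu', 'We'] * 2,
    'Manager': ['Kev', 'Kev', 'Nash', 'Kev', 'Nash', 'Kev'],
    'Store_Opener': ['Jev', 'Jev', 'Oash', 'Jev', 'Oash', 'Kev'],
})

# pivot: one row per (Store_ID, Week_ID), one column per (field, Day).
wide = df.pivot(index=['Store_ID', 'Week_ID'], columns='Day',
                values=['Manager', 'Store_Opener'])
wide.columns = [f'Day_{day}_{field}' for field, day in wide.columns]

# The set_index/unstack route from the answer, for comparison.
same = df.set_index(['Store_ID', 'Week_ID', 'Day']).unstack(-1)
same.columns = [f'Day_{day}_{field}' for field, day in same.columns]

print(wide)
```

Both routes need each (Store_ID, Week_ID, Day) combination to occur only once; with repeated combinations neither can reshape without an aggregation step.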

Also, if you want to preserve the order of the days, use pd.Categorical:

df['Day'] = pd.Categorical(df['Day'], df['Day'].unique(), ordered=True)

df_out = df.set_index(['Store_ID','Week_ID','Day']).unstack(-1)

df_out.columns = [f'Day_{j}_{i}' for i, j in df_out.columns]

df_out

Output:

                 Day_Mo_Manager Day_Tu_Manager Day_We_Manager Day_Th_Manager  \
Store_ID Week_ID                                                               
1        1                  Kev            Kev           Nash            Kev   
         2                  Kev           Nash            Kev            Kev   
         3                 Nash            Kev            Kev           Nash   

                 Day_Fr_Manager Day_Sa_Manager Day_Su_Manager  \
Store_ID Week_ID                                                
1        1                  Kev           Nash            Kev   
         2                 Nash            Kev            Kev   
         3                  Kev            Kev           Nash   

                 Day_Mo_Store_Opener Day_Tu_Store_Opener Day_We_Store_Opener  \
Store_ID Week_ID                                                               
1        1                       Jev                 Jev                Oash   
         2                       Jev                Oash                 Kev   
         3                      Oash                 Kev                 Kev   

                 Day_Th_Store_Opener Day_Fr_Store_Opener Day_Sa_Store_Opener  \
Store_ID Week_ID                                                               
1        1                       Kev                 Kev                Nash   
         2                       Kev                Nash                 Jev   
         3                      Nash                 Kev                 Kev   

                 Day_Su_Store_Opener  
Store_ID Week_ID                      
1        1                       Jev  
         2                       Jev  
         3                      Nash  
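To see why the categorical helps, here is a small sketch on made-up data: unstack() orders the new columns by the sorted level values, which for plain strings is alphabetical, while an ordered categorical sorts by its category order. Note that df['Day'].unique() only gives the calendar order when the rows already appear in that order; passing an explicit list, as assumed below, does not depend on row order.

```python
import pandas as pd

# Hypothetical single week with three days whose alphabetical order
# differs from their calendar order.
df = pd.DataFrame({
    'Week_ID': [1, 1, 1],
    'Day': ['Th', 'Fr', 'Sa'],
    'Manager': ['Kev', 'Kev', 'Nash'],
})

# Plain strings: the unstacked columns come out alphabetically.
plain = df.set_index(['Week_ID', 'Day'])['Manager'].unstack(-1)
plain_order = list(plain.columns)

# Ordered categorical with an explicit calendar order.
df['Day'] = pd.Categorical(df['Day'], categories=['Th', 'Fr', 'Sa'], ordered=True)
cat = df.set_index(['Week_ID', 'Day'])['Manager'].unstack(-1)
cat_order = list(cat.columns)

print(plain_order)
print(cat_order)
```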

And to reshape back to the original shape:

#Use str accessor and slicing to strip 'Day_' from columns then split on first '_'.  
#Unzip and use from_arrays to re-create MultiIndex.
df_out.columns = pd.MultiIndex.from_arrays(list(zip(*df_out.columns.str[4:].str.split('_', n=1))))

#Stack level=0 of MultiIndex column header into the dataframe index
df_out.stack(0).reset_index()

Output:

    Store_ID  Week_ID level_2 Manager Store_Opener
0          1        1      Fr     Kev          Kev
1          1        1      Mo     Kev          Jev
2          1        1      Sa    Nash         Nash
3          1        1      Su     Kev          Jev
4          1        1      Th     Kev          Kev
5          1        1      Tu     Kev          Jev
6          1        1      We    Nash         Oash
7          1        2      Fr    Nash         Nash
8          1        2      Mo     Kev          Jev
9          1        2      Sa     Kev          Jev
10         1        2      Su     Kev          Jev
11         1        2      Th     Kev          Kev
12         1        2      Tu    Nash         Oash
13         1        2      We     Kev          Kev
14         1        3      Fr     Kev          Kev
15         1        3      Mo    Nash         Oash
16         1        3      Sa     Kev          Kev
17         1        3      Su    Nash         Nash
18         1        3      Th    Nash         Nash
19         1        3      Tu     Kev          Kev
20         1        3      We     Kev          Kev
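The result above comes back with the day column named level_2 and the days in alphabetical order. A possible round trip that also restores the column name and the calendar order could look like the sketch below. It runs on a made-up cut-down frame, the names day_order and restored are my own, and it splits the labels with plain Python str.split rather than the .str accessor.

```python
import pandas as pd

day_order = ['Mo', 'Tu', 'We']  # assumed calendar order for this example

# Hypothetical cut-down frame: one store, two weeks, three days per week.
df = pd.DataFrame({
    'Store_ID': [1] * 6,
    'Week_ID': [1, 1, 1, 2, 2, 2],
    'Day': day_order * 2,
    'Manager': ['Kev', 'Kev', 'Nash', 'Kev', 'Nash', 'Kev'],
    'Store_Opener': ['Jev', 'Jev', 'Oash', 'Jev', 'Oash', 'Kev'],
})

# Long to wide, with flattened 'Day_<day>_<field>' column names.
wide = df.set_index(['Store_ID', 'Week_ID', 'Day']).unstack(-1)
wide.columns = [f'Day_{day}_{field}' for field, day in wide.columns]

# Wide to long: drop the 'Day_' prefix, split once on '_', and name the
# day level so that reset_index() yields a 'Day' column, not 'level_2'.
wide.columns = pd.MultiIndex.from_tuples(
    [tuple(name[4:].split('_', 1)) for name in wide.columns],
    names=['Day', None],
)
restored = wide.stack(0).reset_index()

# Put the days back in calendar order, then return to plain strings.
restored['Day'] = pd.Categorical(restored['Day'], categories=day_order, ordered=True)
restored = restored.sort_values(['Store_ID', 'Week_ID', 'Day']).reset_index(drop=True)
restored['Day'] = restored['Day'].astype(str)

print(restored)
```

Splitting only once on '_' matters here because Store_Opener itself contains an underscore.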