Pivot sorted by time values - pandas

Time: 2019-09-24 06:03:31

Tags: pandas pivot

I want to pivot a df and have the columns ordered by their time values rather than by the column names.

import pandas as pd

df = pd.DataFrame({
    'Place' : ['John','Alan','Cory','Jim','John','Alan','Cory','Jim'],
    'Number' : ['2','3','5','5','3','4','6','6'],
    'Code' : ['1','2','3','4','1','2','3','4'],
    'Time' : ['1904-01-01 08:00:00','1904-01-01 09:00:00','1904-01-02 01:00:00','1904-01-02 02:00:00','1904-01-01 08:10:00','1904-01-01 09:10:00','1904-01-02 01:10:00','1904-01-02 02:10:00'],
    })

df = df.pivot_table(index = 'Number', columns = 'Place', values = 'Time', aggfunc = 'first').fillna('')

Out:

Place                  Alan                 Cory                  Jim                 John
Number                                                                                    
2                                                                      1904-01-01 08:00:00
3       1904-01-01 09:00:00                                            1904-01-01 08:10:00
4       1904-01-01 09:10:00                                                               
5                            1904-01-02 01:00:00  1904-01-02 02:00:00                     
6                            1904-01-02 01:10:00  1904-01-02 02:10:00 

Expected output:

Place                  John                 Alan                 Cory                  Jim
Number                                                                                    
2       1904-01-01 08:00:00                                                               
3       1904-01-01 08:10:00  1904-01-01 09:00:00                                          
4                            1904-01-01 09:10:00                                          
5                                                 1904-01-02 01:00:00  1904-01-02 02:00:00
6                                                 1904-01-02 01:10:00  1904-01-02 02:10:00             

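Note: I only added a dummy date to distinguish the times after midnight. Eventually I will drop the date and keep only the time, once the df is sorted properly.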

1 answer:

Answer 0 (score: 1)

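Unfortunately, pivot_table sorts the column names by default, and at the time of this answer there was no parameter to avoid it. A possible solution is therefore to sort the rows by Time first and then call DataFrame.reindex with the unique values of the Place column, which are then in order of first appearance: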

# if necessary, convert to datetimes and sort by time
df['Time'] = pd.to_datetime(df['Time'])
df = df.sort_values('Time')
df1 = df.pivot_table(index='Number',columns='Place',values='Time',aggfunc='first').fillna('')

df1 = df1.reindex(columns=df['Place'].unique())
print(df1)
Place                  John                 Alan                 Cory  \
Number                                                                  
2       1904-01-01 08:00:00                                             
3       1904-01-01 08:10:00  1904-01-01 09:00:00                        
4                            1904-01-01 09:10:00                        
5                                                 1904-01-02 01:00:00   
6                                                 1904-01-02 01:10:00   

Place                   Jim  
Number                       
2                            
3                            
4                            
5       1904-01-02 02:00:00  
6       1904-01-02 02:10:00
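
As a follow-up to the note in the question about dropping the dummy date, here is a minimal sketch of one way to combine both steps. It assumes the column order should follow each Place's earliest Time, which gives the same order as the answer above for this data. The names `order`, `TimeOnly` and `out` are made up for illustration and are not part of the original post.

```python
import pandas as pd

df = pd.DataFrame({
    'Place': ['John', 'Alan', 'Cory', 'Jim', 'John', 'Alan', 'Cory', 'Jim'],
    'Number': ['2', '3', '5', '5', '3', '4', '6', '6'],
    'Time': ['1904-01-01 08:00:00', '1904-01-01 09:00:00',
             '1904-01-02 01:00:00', '1904-01-02 02:00:00',
             '1904-01-01 08:10:00', '1904-01-01 09:10:00',
             '1904-01-02 01:10:00', '1904-01-02 02:10:00'],
})
df['Time'] = pd.to_datetime(df['Time'])

# Assumed ordering rule: places sorted by their earliest timestamp
# (the dummy date keeps after-midnight times last).
order = df.groupby('Place')['Time'].min().sort_values().index

# Hypothetical helper column holding only the time of day as text.
df['TimeOnly'] = df['Time'].dt.strftime('%H:%M:%S')

out = (df.sort_values('Time')
         .pivot_table(index='Number', columns='Place',
                      values='TimeOnly', aggfunc='first')
         .reindex(columns=order)   # undo pivot_table's alphabetical column sort
         .fillna(''))
print(out)
```

Computing the order with groupby instead of unique() means the result does not depend on the row order of df, at the cost of one extra pass over the data.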