Group by and sort with Pandas in Python

Time: 2017-11-29 11:48:50

Tags: python pandas sorting pandas-groupby

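I ran a groupby with an aggregation function, and I would like the months sorted chronologically. How can I do that? At the moment the months come out in alphabetical order: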

func = {'Predictions':['count','mean','median']}

table1 = df.groupby(['FLAG','MONTH']).agg(func)

table1

         Predictions
                        count        mean      median
FLAG       MONTH                                     
0          Apr          49812  106.458209   75.325309
           Aug          44514   93.718901   74.485782
           Feb          51583   98.921119   74.199794
           Jan          54837  100.381814   74.682187
           Jul          49873  100.621877   73.233328
           Jun          47950  103.688532   74.150171
           Mar          52816  106.094774   75.104832
           May          49404  106.847784   75.844241
           Oct          41828   94.744952   76.178077
           Sep          41074   96.430351   75.335261
1          Apr          83377  285.631679  144.582569
           Aug          66285  217.619038  127.087037
           Feb          79693  310.919925  180.507922
           Jan          64730  290.113451  137.291571
           Jul         105213  298.337893  146.956319
           Jun          90305  312.484185  136.222903
           Mar          97274  308.013477  165.752471
           May          87927  310.162600  142.350688
           Oct          47064  258.213619   85.445310
           Sep          47337  240.361602   84.597842

Thanks for your help!

2 answers:

Answer 0: (score: 3)

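You can use reindex, passing the months in calendar order together with level=1 so that only the MONTH level of the index is reordered: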

# rewritten to avoid a MultiIndex in the columns
table1 = df.groupby(['FLAG','MONTH'])['Predictions'].agg(['count','mean','median'])

months = ['Jan', 'Feb', 'Mar', 'Apr','May','Jun', 'Jul', 'Aug','Sep', 'Oct', 'Nov', 'Dec']

# assign back to table1 so the original DataFrame df is not overwritten
table1 = table1.reindex(months, level=1)
print(table1)
             count        mean      median
FLAG MONTH                                
0    Jan     54837  100.381814   74.682187
     Feb     51583   98.921119   74.199794
     Mar     52816  106.094774   75.104832
     Apr     49812  106.458209   75.325309
     May     49404  106.847784   75.844241
     Jun     47950  103.688532   74.150171
     Jul     49873  100.621877   73.233328
     Aug     44514   93.718901   74.485782
     Sep     41074   96.430351   75.335261
     Oct     41828   94.744952   76.178077
1    Jan     64730  290.113451  137.291571
     Feb     79693  310.919925  180.507922
     Mar     97274  308.013477  165.752471
     Apr     83377  285.631679  144.582569
     May     87927  310.162600  142.350688
     Jun     90305  312.484185  136.222903
     Jul    105213  298.337893  146.956319
     Aug     66285  217.619038  127.087037
     Sep     47337  240.361602   84.597842
     Oct     47064  258.213619   85.445310
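
The reindex approach above can be tried on a small made-up frame. This is only a sketch: the values in `df` are toy data chosen for illustration, not the asker's dataset, and the name `ordered` is hypothetical.

```python
import pandas as pd

# Toy data (hypothetical values, not the asker's dataset)
df = pd.DataFrame({
    "FLAG": [0, 0, 0, 1, 1, 1, 0, 1],
    "MONTH": ["Mar", "Jan", "Feb", "Feb", "Mar", "Jan", "Jan", "Jan"],
    "Predictions": [3.0, 1.0, 2.0, 20.0, 30.0, 10.0, 5.0, 14.0],
})

# groupby sorts the string keys, so MONTH comes out alphabetically here
table1 = df.groupby(["FLAG", "MONTH"])["Predictions"].agg(["count", "mean", "median"])

months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

# Reorder only the second index level (MONTH) to follow calendar order
ordered = table1.reindex(months, level=1)
print(ordered)
```

Months that do not occur in the data (Apr to Dec in this toy frame) are not added as empty rows, which matches the output shown above, where Nov and Dec are absent.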

Answer 1: (score: 0)

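As described in this question, you can build a mapping from month abbreviation to month number with the following code:

import calendar
# month_abbr[0] is an empty string, so skip it; gives {'Jan': 1, ..., 'Dec': 12}
month_to_num = {abbr: num for num, abbr in enumerate(calendar.month_abbr) if abbr}

You can then use that mapping to create a new column holding the month number, and group on it:

# create the new month-number column
df["index"] = df["MONTH"].map(month_to_num)
# group by the month number; selecting 'Predictions' keeps the columns single-level
table1 = df.groupby(['FLAG','index'])['Predictions'].agg(['count','mean','median'])
# turn the group keys back into ordinary columns
table1 = table1.reset_index()
# sort by flag, then by month number
table1.sort_values(["FLAG", "index"], inplace=True)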
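
The month-number idea can be sketched end to end as follows. The data is made up for illustration, and the column name `month_num` is an assumption introduced here. `calendar.month_abbr` follows the current locale, so this assumes an English locale in which the abbreviations are 'Jan', 'Feb', and so on.

```python
import calendar
import pandas as pd

# calendar.month_abbr[0] is an empty string, so it is filtered out.
# Assumes an English locale, giving {'Jan': 1, ..., 'Dec': 12}.
month_to_num = {abbr: num for num, abbr in enumerate(calendar.month_abbr) if abbr}

# Toy data (hypothetical values, not the asker's dataset)
df = pd.DataFrame({
    "FLAG": [0, 0, 0, 1, 1, 1],
    "MONTH": ["Mar", "Jan", "Feb", "Feb", "Mar", "Jan"],
    "Predictions": [3.0, 1.0, 2.0, 20.0, 30.0, 10.0],
})

# Hypothetical helper column holding the month number
df["month_num"] = df["MONTH"].map(month_to_num)

# Group on the number and the label together so the label stays in the result,
# then sort by flag and month number
table1 = (
    df.groupby(["FLAG", "month_num", "MONTH"])["Predictions"]
      .agg(["count", "mean", "median"])
      .reset_index()
      .sort_values(["FLAG", "month_num"])
)
print(table1)
```

Keeping both `month_num` and `MONTH` as group keys means the readable month label is still available after sorting; `month_num` can be dropped afterwards if it is not needed.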