Sorting grouped data in pandas

Date: 2020-09-22 09:25:51

Tags: python pandas

I am trying to sort grouped data using pandas. My code:

 import pandas as pd

 df = pd.read_csv("./data3.txt")
 grouped = df.groupby(['cust','year','month'])['price'].count()
 print(grouped)

My data:

cust,year,month,price
astor,2015,Jan,100
astor,2015,Jan,122
astor,2015,Feb,200
astor,2016,Feb,234
astor,2016,Feb,135
astor,2016,Mar,169
astor,2017,Mar,321
astor,2017,Apr,245
tor,2015,Jan,100
tor,2015,Feb,122
tor,2015,Feb,200
tor,2016,Mar,234
tor,2016,Apr,135
tor,2016,May,169
tor,2017,Mar,321
tor,2017,Apr,245

This is my result:

cust   year  month
astor  2015  Feb      1
             Jan      2
       2016  Feb      2
             Mar      1
       2017  Apr      1
             Mar      1
tor    2015  Feb      2
             Jan      1
       2016  Apr      1
             Mar      1
             May      1
       2017  Apr      1
             Mar      1

How can I get the output sorted by month, in calendar order rather than alphabetically?

1 answer:

Answer 0 (score: 1)

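Add the parameter sort=False to groupby. The groups then keep the order in which they first appear in the data, which is already chronological here: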

grouped = df.groupby(['cust','year','month'], sort=False)['price'].count()
print(grouped)
cust   year  month
astor  2015  Jan      2
             Feb      1
       2016  Feb      2
             Mar      1
       2017  Mar      1
             Apr      1
tor    2015  Jan      1
             Feb      2
       2016  Mar      1
             Apr      1
             May      1
       2017  Mar      1
             Apr      1
Name: price, dtype: int64
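
To illustrate why sort=False depends on the row order of the input, here is a minimal sketch using only the first three rows of the question's data. The names `in_order` and `reversed_rows` are introduced here for illustration and are not part of the original answer.

```python
import pandas as pd

# First three rows of the sample data from the question.
df = pd.DataFrame({
    'cust': ['astor', 'astor', 'astor'],
    'year': [2015, 2015, 2015],
    'month': ['Jan', 'Jan', 'Feb'],
    'price': [100, 122, 200],
})

keys = ['cust', 'year', 'month']

# With sort=False, groups come out in order of first appearance.
in_order = df.groupby(keys, sort=False)['price'].count()

# Reversing the rows reverses the order in which groups first appear.
reversed_rows = df.iloc[::-1].groupby(keys, sort=False)['price'].count()

print(list(in_order.index.get_level_values('month')))       # ['Jan', 'Feb']
print(list(reversed_rows.index.get_level_values('month')))  # ['Feb', 'Jan']
```

So sort=False gives calendar order only when the rows already arrive in calendar order.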

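If that is not possible (for example, because the rows are not already in chronological order), convert the months to datetimes first, then convert them back to month names at the end: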

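# Parse the abbreviated month names into datetimes so they sort chronologically
df['month'] = pd.to_datetime(df['month'], format='%b')
# Format a datetime back to its abbreviated month name
f = lambda x: x.strftime('%b')
# Group with the default sort, then rename the month level back to names
grouped = df.groupby(['cust','year','month'])['price'].count().rename(f, level=2)
print(grouped)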
cust   year  month
astor  2015  Jan      2
             Feb      1
       2016  Feb      2
             Mar      1
       2017  Mar      1
             Apr      1
tor    2015  Jan      1
             Feb      2
       2016  Mar      1
             Apr      1
             May      1
       2017  Mar      1
             Apr      1
Name: price, dtype: int64
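
As a further option, not part of the answer above, the month column could be turned into an ordered categorical so that the default sorted groupby follows calendar order. This is a minimal sketch that assumes the months are always three-letter English abbreviations. `csv_text` and `month_order` are names introduced here for illustration, and the data is inlined instead of being read from ./data3.txt.

```python
import io
import pandas as pd

# Sample data from the question, inlined so the sketch is self-contained.
csv_text = """cust,year,month,price
astor,2015,Jan,100
astor,2015,Jan,122
astor,2015,Feb,200
astor,2016,Feb,234
astor,2016,Feb,135
astor,2016,Mar,169
astor,2017,Mar,321
astor,2017,Apr,245
tor,2015,Jan,100
tor,2015,Feb,122
tor,2015,Feb,200
tor,2016,Mar,234
tor,2016,Apr,135
tor,2016,May,169
tor,2017,Mar,321
tor,2017,Apr,245
"""

df = pd.read_csv(io.StringIO(csv_text))

# Calendar order for the abbreviated month names (assumed format).
month_order = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
               'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']
df['month'] = pd.Categorical(df['month'], categories=month_order, ordered=True)

# observed=True keeps only the month categories that actually occur in each group.
grouped = df.groupby(['cust', 'year', 'month'], observed=True)['price'].count()
print(grouped)
```

Unlike sort=False, this does not rely on the rows already being in chronological order, and the month level keeps its readable names without a conversion back from datetimes.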