我正在尝试使用Pandas对分组的数据进行排序 我的代码:
df = pd.read_csv("./data3.txt")
grouped = df.groupby(['cust','year','month'])['price'].count()
print(grouped)
我的数据:
cust,year,month,price
astor,2015,Jan,100
astor,2015,Jan,122
astor,2015,Feb,200
astor,2016,Feb,234
astor,2016,Feb,135
astor,2016,Mar,169
astor,2017,Mar,321
astor,2017,Apr,245
tor,2015,Jan,100
tor,2015,Feb,122
tor,2015,Feb,200
tor,2016,Mar,234
tor,2016,Apr,135
tor,2016,May,169
tor,2017,Mar,321
tor,2017,Apr,245
这是我的结果。
cust year month
astor 2015 Feb 1
Jan 2
2016 Feb 2
Mar 1
2017 Apr 1
Mar 1
tor 2015 Feb 2
Jan 1
2016 Apr 1
Mar 1
May 1
2017 Apr 1
Mar 1
如何获取按月排序的输出?
答案 0 :(得分:1)
将参数sort=False
添加到groupby
:
grouped = df.groupby(['cust','year','month'], sort=False)['price'].count()
print (grouped)
cust year month
astor 2015 Jan 2
Feb 1
2016 Feb 2
Mar 1
2017 Mar 1
Apr 1
tor 2015 Jan 1
Feb 2
2016 Mar 1
Apr 1
May 1
2017 Mar 1
Apr 1
Name: price, dtype: int64
如果不可能,请使用第一个解决方案,将月份转换为日期时间,最后转换回:
df['month'] = pd.to_datetime(df['month'], format='%b')
f = lambda x: x.strftime('%b')
grouped = df.groupby(['cust','year','month'])['price'].count().rename(f, level=2)
print (grouped)
cust year month
astor 2015 Jan 2
Feb 1
2016 Feb 2
Mar 1
2017 Mar 1
Apr 1
tor 2015 Jan 1
Feb 2
2016 Mar 1
Apr 1
May 1
2017 Mar 1
Apr 1
Name: price, dtype: int64