I'm trying to use groupby in pandas, but I'm still new to Python and can't seem to find a solution.
import pandas as pd

raw_data = {'Products': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'C', 'C', 'C', 'C', 'C'],
            'Month': ['201903', '201903', '201902', '201901', '201902', '201901', '201902', '201904', '201903', '201902', '201904', '201903'],
            'Sales': [4, 24, 31, 2, 3, 4, 24, 31, 2, 3, 2, 3]}
df = pd.DataFrame(raw_data, columns=['Products', 'Month', 'Sales'])
df
The data looks like this:
Products Month Sales
0 A 201903 4
1 A 201903 24
2 A 201902 31
3 A 201901 2
4 B 201902 3
5 B 201901 4
6 B 201902 24
7 C 201904 31
8 C 201903 2
9 C 201902 3
10 C 201904 2
11 C 201903 3
For each product, I need the total sales for its two most recent months, like this:
Products Months Sales
A 201902 31
A 201903 28
B 201901 4
B 201902 27
C 201903 5
C 201904 33
Sorry if the formatting isn't right.
Thanks
Answer 0 (score: 1)
This should do it:
(df.groupby(['Products', 'Month'], as_index=False)
   .sum()
   .sort_values(['Products', 'Sales'],
                ascending=(True, False))
   .groupby('Products')
   .head(2))
Products Month Sales
1 A 201902 31
2 A 201903 28
4 B 201902 27
3 B 201901 4
7 C 201904 33
6 C 201903 5
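Note that the snippet above ranks each product's months by Sales, not by Month. For this sample the two top-selling months of every product happen to also be its two most recent ones, so the totals match the requested output, but that would not hold in general. Below is a minimal sketch that selects by recency instead. It assumes Month is always a 'YYYYMM' string, so that plain string sorting is chronological; the names `monthly` and `result` are only illustrative.

```python
import pandas as pd

raw_data = {'Products': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'C', 'C', 'C', 'C', 'C'],
            'Month': ['201903', '201903', '201902', '201901', '201902', '201901',
                      '201902', '201904', '201903', '201902', '201904', '201903'],
            'Sales': [4, 24, 31, 2, 3, 4, 24, 31, 2, 3, 2, 3]}
df = pd.DataFrame(raw_data)

# Total sales per product and month.
monthly = df.groupby(['Products', 'Month'], as_index=False)['Sales'].sum()

# Sort chronologically within each product (assumes 'YYYYMM' strings),
# then keep the last two rows of each product, i.e. its two latest months.
result = (
    monthly.sort_values(['Products', 'Month'])
           .groupby('Products')
           .tail(2)
           .reset_index(drop=True)
)
print(result)
```

With the sample data this gives the rows in the same order as the table in the question (A 201902/201903, B 201901/201902, C 201903/201904). If Month were stored in another format, converting it with pd.to_datetime before sorting would likely be safer.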