Computing a cumulative sum in a pandas DataFrame with groupby

Time: 2020-04-01 01:06:43

Tags: python pandas cumulative-sum

I have the DataFrame below, and I want to compute a cumulative sum on it:

import pandas as pd

df_a = pd.DataFrame({'Location': ['SR01','SR01','SR02','SR01','SR01','SR02'],
                 'User':['101','101','101','102','102','102'],
                 'Year':['2018','2019','2019','2018','2019','2019'],
                 'Month':[12, 1, 2, 12, 1, 2],
                 'Qty':[10, -2, 3, 4, -5, 6]})

My expected output is:

df_a = pd.DataFrame({'Location': ['SR01','SR01','SR02','SR01','SR01','SR02'],
                 'User':['101','101','101','102','102','102'],
                 'Year':['2018','2019','2019','2018','2019','2019'],
                 'Month':[12, 1, 2, 12, 1, 2],
                 'Qty':[10, -2, 3, 4, -5, 6],
                 'CumSum': [10, 8, 3, 4, -1, 6]})

But when I use df_a.groupby(['Location','User','Year','Month']).sum().groupby(level=1).cumsum(), I get this instead:

df_a = pd.DataFrame({'Location': ['SR01','SR01','SR02','SR01','SR01','SR02'],
                 'User':['101','101','101','102','102','102'],
                 'Year':['2018','2019','2019','2018','2019','2019'],
                 'Month':[12, 1, 2, 12, 1, 2],
                 'Qty':[10, 8, 4, -1, 11, 5]})

Can someone explain why my code doesn't work and how to fix it?

1 Answer:

Answer 0: (score: 1)

You need to group by Location and User only, then take the cumulative sum of Qty within each group:

df_a.groupby(['Location','User']).Qty.cumsum()
0    10
1     8
2     3
3     4
4    -1
5     6
Name: Qty, dtype: int64

df_a['CumSum'] = df_a.groupby(['Location','User']).Qty.cumsum()
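
As for why the original attempt gives different numbers, here is a minimal sketch using the question's data. After sum(), the result is indexed by (Location, User, Year, Month) and sorted by those keys, and groupby(level=1) groups on User alone, so the running total for each user carries over from SR01 into SR02. The variable names summed, by_user_only and by_location_user are illustrative only and do not come from the question.

```python
import pandas as pd

df_a = pd.DataFrame({'Location': ['SR01', 'SR01', 'SR02', 'SR01', 'SR01', 'SR02'],
                     'User': ['101', '101', '101', '102', '102', '102'],
                     'Year': ['2018', '2019', '2019', '2018', '2019', '2019'],
                     'Month': [12, 1, 2, 12, 1, 2],
                     'Qty': [10, -2, 3, 4, -5, 6]})

# Original attempt: the summed frame has a 4-level index, sorted by the
# group keys. level=1 is the User level only, so Location is ignored and
# each user's total keeps accumulating across locations.
summed = df_a.groupby(['Location', 'User', 'Year', 'Month']).sum()
by_user_only = summed.groupby(level=1).cumsum()

# Grouping on both index levels restarts the total for every
# (Location, User) pair, but the rows stay in sorted index order.
by_location_user = summed.groupby(level=['Location', 'User']).cumsum()

# The answer's approach: no intermediate sum(), so the original row order
# and index are preserved and the result can be assigned back directly.
df_a['CumSum'] = df_a.groupby(['Location', 'User'])['Qty'].cumsum()

print(by_user_only)
print(by_location_user)
print(df_a)
```

Note that cumsum() accumulates in the current row order within each group, so this assumes the rows are already in chronological order, as they are in the question's data.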