I have a table similar to this one:
import pandas as pd
data = [['2019-02-01', 0, 5],
        ['2019-02-01', 1, 12],
        ['2019-02-01', 2, 18],
        ['2019-02-01', 3, 23],
        ['2019-02-01', 4, 20],
        ['2019-03-01', 0, 12],
        ['2019-03-01', 1, 7],
        ['2019-03-01', 2, 6],
        ['2019-03-01', 3, 5],
        ['2019-03-01', 4, 8]]
df = pd.DataFrame(data, columns=['Start_Month', 'Bucket', 'Complete'])
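I want a new column that, within each Start_Month, holds the group total of Complete minus the Complete values of all earlier buckets. For example, for 2019-02-01 the first value (Bucket 0) is the group sum of Complete, which is 78. The next value (Bucket 1) is 78 - 5 = 73, where 5 is the Complete value of Bucket 0. Bucket 2 of the same Start_Month is 78 - 5 - 12 = 61, and so on. The expected values were shown in a picture of the calculation. I tried: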
df['new_Com']=df.groupby(['Start_Month']).Complete.sum() - df.groupby(['Start_Month']).Complete.shift(1).cumsum().fillna(0).astype(int)
This does not work.
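Answer 0 (score: 2)
Try reversing the row order, then take a cumsum within each Start_Month. The result keeps the original index labels, so assigning it back to df lines each value up with its own row: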
df['New'] = df.iloc[::-1].groupby('Start_Month').Complete.cumsum()
df
  Start_Month  Bucket  Complete  New
0  2019-02-01       0         5   78
1  2019-02-01       1        12   73
2  2019-02-01       2        18   61
3  2019-02-01       3        23   43
4  2019-02-01       4        20   20
5  2019-03-01       0        12   38
6  2019-03-01       1         7   26
7  2019-03-01       2         6   19
8  2019-03-01       3         5   13
9  2019-03-01       4         8    8
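As far as I can tell, the attempt in the question fails for two reasons. First, groupby(...).sum() returns one value per Start_Month, indexed by the month, so it does not align with the row index of df and the subtraction yields NaN. Second, the cumsum applied after shift(1) runs over the whole column rather than per group. If you would rather stay close to that "total minus earlier buckets" idea, here is a minimal sketch. It assumes the rows are already sorted by Bucket within each Start_Month; the names grouped, total, and earlier are only illustrative.

```python
import pandas as pd

data = [['2019-02-01', 0, 5], ['2019-02-01', 1, 12], ['2019-02-01', 2, 18],
        ['2019-02-01', 3, 23], ['2019-02-01', 4, 20],
        ['2019-03-01', 0, 12], ['2019-03-01', 1, 7], ['2019-03-01', 2, 6],
        ['2019-03-01', 3, 5], ['2019-03-01', 4, 8]]
df = pd.DataFrame(data, columns=['Start_Month', 'Bucket', 'Complete'])

grouped = df.groupby('Start_Month')['Complete']

# transform('sum') broadcasts each group's total back onto every row,
# so the result shares df's row index (unlike a plain .sum()).
total = grouped.transform('sum')

# Per-group running sum minus the current row = sum of the earlier buckets.
earlier = grouped.cumsum() - df['Complete']

df['new_Com'] = total - earlier
print(df)
```

On the sample data this should give the same column as the reversed cumsum above. The reversed cumsum is shorter; this version just makes the subtraction described in the question explicit.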