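I want to sum one column according to the descending order of another column. What I mean is that the rightmost column, created_at, runs in descending order, and it does so more than once. I want to sum the last column across rows that hold the same position in each descending run, so the rows with created_at 57 and 106 would be added together, then 56 and 105, and so on. Any help would be appreciated.
Sample of the ext_bal dataframe: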
time app_vendor_id last source currency created_at
0 2019-06-23 00:00:00+00:00 PTE 0.0 COINCHECK XRP 57
1 2019-06-22 00:00:00+00:00 PTE 0.0 COINCHECK XRP 56
2 2019-06-21 00:00:00+00:00 PTE 0.0 COINCHECK XRP 55
3 2019-06-20 00:00:00+00:00 PTE 0.0 COINCHECK XRP 54
4 2019-06-19 00:00:00+00:00 PTE 0.0 COINCHECK XRP 53
5 2019-06-18 00:00:00+00:00 PTE 0.0 COINCHECK XRP 52
...
18 2019-08-19 00:00:00+00:00 PTE 570100.0 COINBASE XRP 106
19 2019-08-18 00:00:00+00:00 PTE 570100.0 COINBASE XRP 105
20 2019-08-08 00:00:00+00:00 PTE 570100.0 COINBASE XRP 104
21 2019-08-07 00:00:00+00:00 PTE 570000.0 COINBASE XRP 103
22 2019-08-06 00:00:00+00:00 PTE 570050.0 COINBASE XRP 102
23 2019-08-05 00:00:00+00:00 PTE 570020.0 COINBASE XRP 101
24 2019-08-04 00:00:00+00:00 PTE 570020.0 COINBASE XRP 100
25 2019-08-03 00:00:00+00:00 PTE 570080.0 COINBASE XRP 99
Sample of the desired output:
app_vendor_id last currency created_at
0 PTE 570100.0 XRP 0
1 PTE 570100.0 XRP 1
My code:
ext_bal = ext_bal.groupby(['app_vendor_id', 'created_at' ,'currency']).sum().reset_index()
Answer 0 (score: 0)
Let me know if this works:
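import numpy as np
import pandas as pd

# Difference from the previous row; the first row keeps its own created_at value
df['created_at_mask'] = df.created_at.sub(df.created_at.shift()).fillna(df.created_at.iloc[0])
# A step of -1 means the row continues the current descending run: blank it, then forward-fill the run's label
df.loc[df['created_at_mask']==-1.0, 'created_at_mask'] = np.nan
df['created_at_mask'] = df['created_at_mask'].ffill()
groups = dict(list(df.groupby(by='created_at_mask')['last']))
# Line the runs up by position and add them row by row (axis must be passed by keyword in pandas 2.x)
pd.concat([e.reset_index(drop=True) for e in groups.values()], axis=1).sum(axis=1)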
0 570100.0
1 570100.0
2 570100.0
3 570000.0
4 570050.0
5 570020.0
6 570020.0
7 570080.0
dtype: float64
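A variation on the same idea, offered only as a sketch: label each descending run with a cumulative counter and use the row's position inside its run as the grouping key, which also yields the 0, 1, ... column shown in the desired output. The toy values of last below are made up for illustration, rank_in_run is a hypothetical column name, and the approach assumes that created_at drops by exactly 1 within a run.

```python
import pandas as pd

# Toy data modelled on the question: two descending runs of created_at.
# The `last` values are invented for illustration only.
df = pd.DataFrame({
    "app_vendor_id": ["PTE"] * 6,
    "last": [1.0, 2.0, 3.0, 10.0, 20.0, 30.0],
    "source": ["COINCHECK"] * 3 + ["COINBASE"] * 3,
    "currency": ["XRP"] * 6,
    "created_at": [57, 56, 55, 106, 105, 104],
})

# A new run starts wherever created_at does not drop by exactly 1
# (the first row has a NaN difference, so it also starts a run).
run_id = df["created_at"].diff().ne(-1).cumsum()

# Position of each row inside its run: 0 for the top of a run, 1 for the next, ...
# `rank_in_run` is a hypothetical helper column, not part of the original data.
df["rank_in_run"] = df.groupby(run_id).cumcount()

# Sum `last` over rows that share the same position across runs.
result = (
    df.groupby(["app_vendor_id", "currency", "rank_in_run"], as_index=False)["last"]
      .sum()
)
print(result)
```

If every run always belongs to a single source, as it appears to in the sample, df.groupby("source").cumcount() might serve as the position column instead, without relying on the step size.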