Sum by the max order of a specified column

Time: 2019-08-29 02:05:22

Tags: python pandas influxdb

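I want to sum one column according to the descending order of another column. What I mean is that the rightmost column, created_at, runs in descending order, and I want to sum each last value by its position in that order. So the rows with created_at 57 and 106 are added together, then 56 and 105, and so on. Any help would be appreciated.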

Sample of the ext_bal DataFrame:

            time             app_vendor_id      last     source currency  created_at
0  2019-06-23 00:00:00+00:00           PTE       0.0  COINCHECK      XRP          57
1  2019-06-22 00:00:00+00:00           PTE       0.0  COINCHECK      XRP          56
2  2019-06-21 00:00:00+00:00           PTE       0.0  COINCHECK      XRP          55
3  2019-06-20 00:00:00+00:00           PTE       0.0  COINCHECK      XRP          54
4  2019-06-19 00:00:00+00:00           PTE       0.0  COINCHECK      XRP          53
5  2019-06-18 00:00:00+00:00           PTE       0.0  COINCHECK      XRP          52
...
18 2019-08-19 00:00:00+00:00           PTE  570100.0   COINBASE      XRP         106
19 2019-08-18 00:00:00+00:00           PTE  570100.0   COINBASE      XRP         105
20 2019-08-08 00:00:00+00:00           PTE  570100.0   COINBASE      XRP         104
21 2019-08-07 00:00:00+00:00           PTE  570000.0   COINBASE      XRP         103
22 2019-08-06 00:00:00+00:00           PTE  570050.0   COINBASE      XRP         102
23 2019-08-05 00:00:00+00:00           PTE  570020.0   COINBASE      XRP         101
24 2019-08-04 00:00:00+00:00           PTE  570020.0   COINBASE      XRP         100
25 2019-08-03 00:00:00+00:00           PTE  570080.0   COINBASE      XRP          99

Sample of the desired output:

   app_vendor_id      last      currency  created_at
0            PTE      570100.0      XRP          0
1            PTE      570100.0      XRP          1

My code:

ext_bal = ext_bal.groupby(['app_vendor_id', 'created_at' ,'currency']).sum().reset_index()

1 answer:

Answer 0 (score: 0)

Let me know if this works:

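import numpy as np
import pandas as pd

# Step from the previous row; the first row keeps its own created_at value.
df['created_at_mask'] = df.created_at.sub(df.created_at.shift()).fillna(df.created_at.iloc[0])

# A step of -1 continues the current descending run: blank it, then forward-fill the run's label.
df.loc[df['created_at_mask'] == -1.0, 'created_at_mask'] = np.nan
df['created_at_mask'] = df['created_at_mask'].ffill()

# Line the runs up by position and sum them row-wise.
groups = dict(list(df.groupby(by='created_at_mask')['last']))
pd.concat([e.reset_index(drop=True) for e in groups.values()], axis=1).sum(axis=1)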

0    570100.0
1    570100.0
2    570100.0
3    570000.0
4    570050.0
5    570020.0
6    570020.0
7    570080.0
dtype: float64
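
The mask above labels each run by the size of the jump that starts it, so two runs can share a label if their jumps happen to be equal. A possible variant, sketched here under assumptions, numbers the runs with a cumulative sum instead and uses the position inside each run as the new created_at, which gives the 0, 1, ... values shown in the desired output. The six-row frame is a hypothetical miniature built from values in the question's sample, and the sketch assumes the rows of each run are already in descending created_at order, as they are there.

```python
import pandas as pd

# Hypothetical miniature of ext_bal: two descending runs of created_at.
ext_bal = pd.DataFrame({
    "app_vendor_id": ["PTE"] * 6,
    "last": [0.0, 0.0, 0.0, 570100.0, 570100.0, 570100.0],
    "source": ["COINCHECK"] * 3 + ["COINBASE"] * 3,
    "currency": ["XRP"] * 6,
    "created_at": [57, 56, 55, 106, 105, 104],
})

# A new run starts wherever created_at does not drop by exactly 1
# (the first row's diff is NaN, so it also starts a run).
run_id = ext_bal["created_at"].diff().ne(-1).cumsum()

# Position inside each run: 0 for the run's first (largest) created_at, then 1, 2, ...
position = ext_bal.groupby(run_id).cumcount()

# Replace created_at with that position and sum `last` across runs.
result = (
    ext_bal.assign(created_at=position)
    .groupby(["app_vendor_id", "currency", "created_at"], as_index=False)["last"]
    .sum()
)[["app_vendor_id", "last", "currency", "created_at"]]

print(result)
```

If the runs in the real data always coincide with the source column, grouping by source instead of run_id when computing the position might be simpler, but that is a guess about the data rather than something the question states.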