Question

I'm looking for the pandas equivalent of Excel's SUMIFS, for example:

=SUMIFS($D4:D$107,$D$107,$G4:G$107)

I have three columns: contract, amount and transaction_type_tla. For each contract, I want to sum the amount where the transaction type is CBP. The following formula does not work:

data['Var']=(data.groupby('contract',"transaction_type_tla=='CBP'")['amount'].cumsum())

Answer 1

Borrowing jp's data :-) (note that the type column there is named type rather than transaction_type_tla)

df['New']=df.groupby('contract').apply(lambda x : x['amount'][x['type']=='CBP'].cumsum()).reset_index(level=0,drop=True)
df
Out[258]: 
  contract  amount type    New
0        A     123  ABC    NaN
1        A     341  ABC    NaN
2        A     652  CBP  652.0
3        A     150  CBP  802.0
4        B     562  DEF    NaN
5        B     674  ABC    NaN
6        B     562  CBP  562.0
7        B     147  CBP  709.0
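
For comparison, here is a minimal sketch of the same running total without apply. It assumes the same sample data and the column name type; the helper name is_cbp is mine, not from the original answer. The idea is to blank out the non-CBP amounts first and then take a cumulative sum within each contract:

```python
import pandas as pd

df = pd.DataFrame({
    'contract': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
    'amount':   [123, 341, 652, 150, 562, 674, 562, 147],
    'type':     ['ABC', 'ABC', 'CBP', 'CBP', 'DEF', 'ABC', 'CBP', 'CBP'],
})

# Boolean mask marking the rows that should be counted (hypothetical helper name).
is_cbp = df['type'] == 'CBP'

# where() turns non-CBP amounts into NaN; the grouped cumsum skips NaN,
# so only CBP rows receive a running total within their contract.
df['New'] = df['amount'].where(is_cbp).groupby(df['contract']).cumsum()
print(df)
```

This should give the same New column as above, with NaN on the non-CBP rows.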

Answer 2

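Edit: I think @Wen's answer is closer to what you asked for, but in case you want the result as a Series:

A simple approach is to first filter the transactions by the transaction_type_tla you are looking for, then apply groupby and whichever aggregation method you want: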

ans = data[data['transaction_type_tla'] == 'CBP']
ans.groupby('contract')['amount'].cumsum()

This gives the result as a Series.
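
As a rough illustration of the filter-then-group idea, the sketch below runs it on made-up sample data (the values are assumed, borrowed from the other answers) and shows two follow-ups: swapping cumsum for sum to get one total per contract, and assigning the Series back to the original frame. The names cbp, running and totals are only for this example:

```python
import pandas as pd

# Assumed sample data, using the column names from the question.
data = pd.DataFrame({
    'contract': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
    'amount':   [123, 341, 652, 150, 562, 674, 562, 147],
    'transaction_type_tla': ['ABC', 'ABC', 'CBP', 'CBP', 'DEF', 'ABC', 'CBP', 'CBP'],
})

# Step 1: keep only the CBP transactions.
cbp = data[data['transaction_type_tla'] == 'CBP']

# Step 2: aggregate per contract; cumsum gives a running total, sum a single total.
running = cbp.groupby('contract')['amount'].cumsum()
totals = cbp.groupby('contract')['amount'].sum()

# The running total keeps the original row labels, so assigning it back
# aligns on the index and leaves NaN on the non-CBP rows.
data['Var'] = running
print(totals)
print(data)
```

Because the filtered rows keep their original index, no merge is needed to bring the result back into data.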

Answer 3

Here is one way. I've set up some dummy data to test with.

The output is a dataframe in the same format, but with the CBP transactions totaled.

import pandas as pd

df = pd.DataFrame([['A', 123, 'ABC'],
                   ['A', 341, 'ABC'],
                   ['A', 652, 'CBP'],
                   ['A', 150, 'CBP'],
                   ['B', 562, 'DEF'],
                   ['B', 674, 'ABC'],
                   ['B', 562, 'CBP'],
                   ['B', 147, 'CBP']],
                  columns=['contract', 'amount', 'type'])

s = df.groupby(['contract', 'type'])['amount'].sum()
df = df.set_index(['contract', 'type']).join(s, rsuffix='_group')

df.loc[pd.IndexSlice[:, 'CBP'], 'amount'] = df.loc[pd.IndexSlice[:, 'CBP'], 'amount_group']
# Pass the column by keyword; the positional axis argument was removed in pandas 2.0
df = df.drop(columns='amount_group').reset_index().drop_duplicates()

#   contract type  amount
# 0        A  ABC     123
# 1        A  ABC     341
# 2        A  CBP     802
# 4        B  ABC     674
# 5        B  CBP     709
# 7        B  DEF     562
