I have a CSV file containing the following data.
My input:
Firm Policy Status
=======================
Firm1 Pol1 Active
Firm1 Pol2 Active
Firm1 Pol3 Inactive
Firm2 Pol4 Active
My desired output:
Firm ActivePolicy InactivePolicy
===============================================
Firm1 2 1
Firm2 1 0
So, if I had to break it down into SQL, it would be something like:
Select firm, count(*) as ActivePolicy from Mytable where status = 'Active' group by firm;
Select firm, count(*) as InactivePolicy from Mytable where status = 'Inactive' group by firm;
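Now I want to implement this in Python using the pandas library. I already have the DataFrame ready. I tried using a lambda function, but it did not work as expected.
Here is the code I tried: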
from pandas import read_excel
my_sheet_name = 'sheet1'
df = read_excel('C:\\Users\\test.xlsx', sheet_name = my_sheet_name)
g = df.groupby('Firm') # GROUP BY Firm
g.filter(lambda x: x['status'] == 'Active').sum()
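For reference, two things in the attempt above probably explain the failure: `groupby(...).filter` expects the function to return a single True/False per group, while `x['status'] == 'Active'` returns a whole boolean Series, and the column in the sample data is spelled `Status`, not `status` (assuming the sheet header matches the table shown). One possible way to get the desired table is `pd.crosstab`, sketched below. The DataFrame is built inline from the sample rows instead of being read from the file, and the names `counts` and `result` are only illustrative.

```python
import pandas as pd

# Sample data copied from the input table in the question
# (built inline here; the real data would come from the file).
df = pd.DataFrame({
    "Firm": ["Firm1", "Firm1", "Firm1", "Firm2"],
    "Policy": ["Pol1", "Pol2", "Pol3", "Pol4"],
    "Status": ["Active", "Active", "Inactive", "Active"],
})

# Count rows for every (Firm, Status) combination.
# Combinations that never occur are filled with 0.
counts = pd.crosstab(df["Firm"], df["Status"])

# Rename the status columns to the headers used in the desired output.
counts = counts.rename(
    columns={"Active": "ActivePolicy", "Inactive": "InactivePolicy"}
)

# Turn the Firm index back into a regular column and drop the
# leftover "Status" label on the column axis.
result = counts.reset_index()
result.columns.name = None

print(result)
```

A similar result should also be reachable with `df.groupby(["Firm", "Status"]).size().unstack(fill_value=0)`; `crosstab` is used here only because it expresses the firm-by-status count in a single call.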