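I want to compute ratios of counts based on a condition, and I'm struggling to calculate them correctly with a pandas DataFrame.
The data looks like this: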
JOB_ROLE                COMMENTS           ACTIVITY_TYPE  COUNTS
Director-Level          Meeting Requested  EmailSend         490
Manager-Level           Meeting Requested  EmailSend         305
Non-Managerial          Meeting Requested  EmailSend         272
Top Executive; C-Level  Meeting Requested  EmailSend         226
VP-Level                Meeting Requested  EmailSend         185
Director-Level          Meeting Requested  FormSubmit        131
Manager-Level           Meeting Requested  FormSubmit         74
Top Executive; C-Level  Meeting Requested  FormSubmit         61
VP-Level                Meeting Requested  FormSubmit         53
Non-Managerial          Meeting Requested  FormSubmit         52
Other                   Meeting Requested  EmailSend          20
Other                   Meeting Requested  FormSubmit          2
Here is my attempt:
ratios = mr_jr.groupby('JOB_ROLE').apply(lambda x: x[x['ACTIVITY_TYPE']=='FormSubmit'].COUNTS / x[x['ACTIVITY_TYPE']=='EmailSend'].COUNTS)
What is the correct way to apply a condition within each group and then do the arithmetic?
Many thanks in advance.
EDITED
Expected output:
print(list(ratios)) # prints: [0.26, 0.24, 0.19, 0.27, 0.28, 0.1]
Answer 0 (score: 2)
This looks like a job for a pivot table.
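# Keyword arguments are used because recent pandas versions no longer accept positional ones here.
piv = df.pivot(index='JOB_ROLE', columns='ACTIVITY_TYPE', values='COUNTS')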
Output:
In [119]: piv.FormSubmit / piv.EmailSend
Out[119]:
JOB_ROLE
Director-Level            0.267347
Manager-Level             0.242623
Non-Managerial            0.191176
Other                     0.100000
Top Executive; C-Level    0.269912
VP-Level                  0.286486
dtype: float64
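To reproduce this end to end, here is a minimal sketch. It assumes `df` holds the data from the question; the `rows` list is retyped by hand from that table and the COMMENTS column is left out because it is constant. It also assumes a pandas version where `pivot` takes keyword arguments.

```python
import pandas as pd

# Data retyped from the question; COMMENTS is omitted because it is constant.
rows = [
    ("Director-Level", "EmailSend", 490),
    ("Manager-Level", "EmailSend", 305),
    ("Non-Managerial", "EmailSend", 272),
    ("Top Executive; C-Level", "EmailSend", 226),
    ("VP-Level", "EmailSend", 185),
    ("Director-Level", "FormSubmit", 131),
    ("Manager-Level", "FormSubmit", 74),
    ("Top Executive; C-Level", "FormSubmit", 61),
    ("VP-Level", "FormSubmit", 53),
    ("Non-Managerial", "FormSubmit", 52),
    ("Other", "EmailSend", 20),
    ("Other", "FormSubmit", 2),
]
df = pd.DataFrame(rows, columns=["JOB_ROLE", "ACTIVITY_TYPE", "COUNTS"])

# One row per job role, one column per activity type.
piv = df.pivot(index="JOB_ROLE", columns="ACTIVITY_TYPE", values="COUNTS")

# Column-wise division; the result is a Series indexed by JOB_ROLE.
ratios = piv["FormSubmit"] / piv["EmailSend"]
print(ratios)
```

Note that `pivot` raises an error if a (JOB_ROLE, ACTIVITY_TYPE) pair appears more than once; in that case `pivot_table` with `aggfunc='sum'` is the usual alternative.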
Without a pivot:
df.set_index('JOB_ROLE', drop=True, inplace=True)
emails = df[df.ACTIVITY_TYPE == 'EmailSend']
forms = df[df.ACTIVITY_TYPE == 'FormSubmit']
print(forms.COUNTS / emails.COUNTS)
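If you would rather stay close to the `groupby` attempt from the question, the likely problem there is index alignment: within each group the FormSubmit row and the EmailSend row carry different row labels, so dividing the two one-element Series produces NaN. A minimal sketch of one way around this is to reduce each side to a scalar first. The helper name `form_to_email` and the two-role sample frame are made up for illustration.

```python
import pandas as pd

# Small sample taken from the question's data (two job roles only).
df = pd.DataFrame({
    "JOB_ROLE": ["Director-Level", "Other", "Director-Level", "Other"],
    "ACTIVITY_TYPE": ["EmailSend", "EmailSend", "FormSubmit", "FormSubmit"],
    "COUNTS": [490, 20, 131, 2],
})

def form_to_email(group):
    """Hypothetical helper: FormSubmit count divided by EmailSend count."""
    # .sum() collapses each filtered Series to a scalar, so no label alignment happens.
    forms = group.loc[group["ACTIVITY_TYPE"] == "FormSubmit", "COUNTS"].sum()
    emails = group.loc[group["ACTIVITY_TYPE"] == "EmailSend", "COUNTS"].sum()
    return forms / emails

# Selecting the needed columns keeps the grouping column out of each group frame.
ratios = df.groupby("JOB_ROLE")[["ACTIVITY_TYPE", "COUNTS"]].apply(form_to_email)
print(ratios)
```

This is typically slower than the pivot approach on large frames, since `apply` calls the Python function once per group.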