My df:
df_RFQ_by_Salesperson = df[
    df['state'].str.contains('Done')
][[
    'sales_person_name2',
    'rfq_qty',
    'rfq_qty_CAD_Equiv',
    'state',
]].copy()
display(df_RFQ_by_Salesperson.head(3))
    sales_person_name2     rfq_qty  rfq_qty_CAD_Equiv state
14                  AY    200000.0       2.568713e+05  Done
22                  AY   1000000.0       1.284357e+06  Done
28                 YJJ  25000000.0       4.420085e+07  Done
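I want to group df_RFQ_by_Salesperson by sales_person_name2, sum rfq_qty, sum rfq_qty_CAD_Equiv, and count state, then add a percentage column based on rfq_qty_CAD_Equiv. I have already worked out the sums and the percentage column, but I'm not sure how to include the count of state.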
df_RFQ_by_Salesperson = df_RFQ_by_Salesperson.rename(columns={'state':'Done Trades'}, level=0) # rename the column header in the groupby
df_RFQ_by_Salesperson = df_RFQ_by_Salesperson.groupby(['sales_person_name2'])[['rfq_qty', 'rfq_qty_CAD_Equiv']].sum()  # select several columns with a list (double brackets)
Total_Done_Volume = df_RFQ_by_Salesperson['rfq_qty_CAD_Equiv'].sum()
df_RFQ_by_Salesperson['Percentage'] = df_RFQ_by_Salesperson['rfq_qty_CAD_Equiv'].div(Total_Done_Volume)
display(df_RFQ_by_Salesperson.sort_values('Percentage',ascending=False))
sales_person_name2      rfq_qty  rfq_qty_CAD_Equiv  Percentage  Count of State
MP                  214400000.0       3.045802e+08    0.258089               ?
AC                  228800000.0       2.648099e+08    0.224390               ?
YJJ                 202500000.0       2.490527e+08    0.211038               ?
RW                  129000000.0       1.693008e+08    0.143459               ?
AY                  118366000.0       1.189635e+08    0.100805               ?
RL                   78617000.0       7.342725e+07    0.062219               ?
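Is it possible to compute a count together with the sums in a single groupby?
Answer 0: (score: 1)
You can aggregate multiple columns with different functions by passing a mapping from column name to function: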
# df is the frame being grouped; pass the Done-only filtered frame to count only Done rows
out = df.groupby('sales_person_name2').agg(
    {'rfq_qty': 'sum', 'rfq_qty_CAD_Equiv': 'sum', 'state': 'size'}
)
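The answer uses 'size' where the question asks for a count. The two differ only when the column has missing values: 'size' counts every row in the group, while 'count' counts non-null values. A small sketch of the difference, using a made-up frame (`toy` and its values are illustrative, not from the question's data):

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: one of AY's rows has a missing state
toy = pd.DataFrame({
    'sales_person_name2': ['AY', 'AY', 'YJJ'],
    'state': ['Done', np.nan, 'Done'],
})

grouped = toy.groupby('sales_person_name2')['state']
size_result = grouped.size()    # rows per group, NaN included
count_result = grouped.count()  # non-null values per group

print(size_result)
print(count_result)
```

If the frame has already been filtered to rows whose state contains 'Done', there are no missing states left, so both give the same number.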
Then compute the percentage separately and assign it to a percentage column:
out['percentage'] = out.rfq_qty_CAD_Equiv / out.rfq_qty_CAD_Equiv.sum()
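Putting the steps together, here is a minimal end-to-end sketch. It assumes pandas 0.25 or later for named aggregation, which lets the count column get its own name in the same call. The `sample` frame, its numbers, and the `done_trades` column name are invented for illustration.

```python
import pandas as pd

# Hypothetical sample rows standing in for the real df
sample = pd.DataFrame({
    'sales_person_name2': ['AY', 'AY', 'YJJ', 'RW'],
    'rfq_qty': [200.0, 100.0, 500.0, 50.0],
    'rfq_qty_CAD_Equiv': [100.0, 300.0, 600.0, 50.0],
    'state': ['Done', 'Done', 'Done', 'Cancelled'],
})

# Keep only Done rows, as in the question
done = sample[sample['state'].str.contains('Done')]

# Named aggregation: output_column=(input_column, function)
summary = done.groupby('sales_person_name2').agg(
    rfq_qty=('rfq_qty', 'sum'),
    rfq_qty_CAD_Equiv=('rfq_qty_CAD_Equiv', 'sum'),
    done_trades=('state', 'size'),
)

# Share of each salesperson in the total CAD-equivalent volume
summary['Percentage'] = summary['rfq_qty_CAD_Equiv'] / summary['rfq_qty_CAD_Equiv'].sum()
summary = summary.sort_values('Percentage', ascending=False)
print(summary)
```

The dict form in the answer gives the same numbers; the named form just avoids a separate rename of the state column afterwards.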