Groupby result does not include the grouping column, which causes an error when using countplot

Time: 2018-05-29 08:23:03

Tags: python dataframe seaborn pandas-groupby

My data is grouped correctly.

df_RFQ_by_Salesperson = df[
                          (df['state'].str.contains('Done'))
                          ][['sales_person_name2',
                             'rfq_qty',
                             'rfq_qty_CAD_Equiv',
                             'state'
                            ]].copy()
df_RFQ_by_Salesperson = df_RFQ_by_Salesperson.groupby('sales_person_name2').agg({'state': 'size','rfq_qty': 'sum', 'rfq_qty_CAD_Equiv': 'sum'})
df_RFQ_by_Salesperson['Percentage'] = df_RFQ_by_Salesperson.rfq_qty_CAD_Equiv / df_RFQ_by_Salesperson.rfq_qty_CAD_Equiv.sum()
df_RFQ_by_Salesperson = df_RFQ_by_Salesperson.rename(columns={'state':'Done Trades'}, level=0) # rename the column header in the groupby
display(df_RFQ_by_Salesperson.sort_values('Percentage',ascending=False))

sales_person_name2  Done Trades  rfq_qty      rfq_qty_CAD_Equiv  Percentage
MP                  11           214400000.0  3.045802e+08       0.258089
AC                  22           228800000.0  2.648099e+08       0.224390
YJ                  7            202500000.0  2.490527e+08       0.211038
RW                  18           129000000.0  1.693008e+08       0.143459
AY                  171          118366000.0  1.189635e+08       0.100805
RL                  47           78617000.0   7.342725e+07       0.062219

But when I try to visualize it with sns.countplot, the column I grouped by is not in the list of columns, so an error is raised.

display(df_RFQ_by_Salesperson.columns)

Index(['Done Trades', 'rfq_qty', 'rfq_qty_CAD_Equiv', 'Percentage'], dtype='object')
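
The grouping key is not lost here: by default, groupby moves it into the index of the result, which is why it does not show up in .columns. The following is a minimal sketch with made-up data (only the column names come from the question; the values are hypothetical) showing two ways to get it back as a regular column, reset_index() and as_index=False:

```python
import pandas as pd

# Hypothetical sample data; only the column names follow the question.
df = pd.DataFrame({
    'sales_person_name2': ['MP', 'AC', 'MP', 'AC', 'YJ'],
    'rfq_qty': [100.0, 200.0, 300.0, 400.0, 500.0],
    'state': ['Done', 'Done', 'Done', 'Done', 'Done'],
})

grouped = df.groupby('sales_person_name2').agg({'state': 'size', 'rfq_qty': 'sum'})

# The grouping key lives in the index, not in the columns.
key_in_columns = 'sales_person_name2' in grouped.columns
index_name = grouped.index.name

# Option 1: move the index back into a regular column after aggregating.
flat = grouped.reset_index()

# Option 2: ask groupby not to use the key as the index in the first place.
flat_alt = df.groupby('sales_person_name2', as_index=False).agg({'rfq_qty': 'sum'})
```

Either option should make df_RFQ_by_Salesperson['sales_person_name2'] resolvable again, so the KeyError would no longer be raised.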

# # Visualisation 
ax = sns.countplot(
                x='sales_person_name2', 
                data=df_RFQ_by_Salesperson, 
                # Order by the count
                order = df_RFQ_by_Salesperson['sales_person_name2'].value_counts().index,
                color=plot_colour
                 )
for label in ax.xaxis.get_ticklabels():
    label.set_rotation(90)  
plt.show()    

KeyError: 'sales_person_name2'
---> 22   order = df_RFQ_by_Salesperson['sales_person_name2'].value_counts().index,

Is there a way to force Python to include sales_person_name2 in the DataFrame?
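
One caveat worth noting: sns.countplot counts rows per category, and after the groupby each salesperson appears exactly once, so even with the column restored every bar would have height 1. The sketch below assumes the intent is to plot the already aggregated 'Done Trades' counts, and uses sns.barplot on the frame after reset_index(). The data, the 'steelblue' colour (standing in for plot_colour) and the headless 'Agg' backend are assumptions for illustration only.

```python
import matplotlib
matplotlib.use('Agg')  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical aggregated frame shaped like the one in the question:
# the salesperson is the index, 'Done Trades' is a regular column.
agg = pd.DataFrame(
    {'Done Trades': [11, 22, 7]},
    index=pd.Index(['MP', 'AC', 'YJ'], name='sales_person_name2'),
)

# Restore the grouping key as a column and sort by the count to plot.
plot_df = agg.reset_index().sort_values('Done Trades', ascending=False)

ax = sns.barplot(
    x='sales_person_name2',
    y='Done Trades',
    data=plot_df,
    order=plot_df['sales_person_name2'].tolist(),  # keep the sorted order
    color='steelblue',
)
ax.tick_params(axis='x', rotation=90)  # rotate the x tick labels

bar_heights = [patch.get_height() for patch in ax.patches]
plt.close(ax.figure)
```

If a countplot is really what is wanted, it would have to be drawn from the filtered, un-aggregated frame (the one built before the groupby), where each row is one trade.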

0 answers:

No answers yet.