Sorting a stacked bar chart

Time: 2018-04-19 15:42:15

Tags: python python-3.x pandas dataframe cptbarplot

All,

I have a DataFrame that I have grouped and sorted, which looks like this:

brkrcy = data[data['upload_date'] == Date].groupby(['CPTY', 'currency'], as_index=False).agg({"Gross Loan Amount": "sum"})
brkrcy = brkrcy.sort_values(by=['CPTY', 'Gross Loan Amount'], ascending=[True, False])
brkrcy = brkrcy.set_index('CPTY')

Double ranked

       currency  Gross Loan Amount
CPTY                              
BARC        RUB       2.178780e+07
BARC        ZAR       7.779714e+07
BARC        JPY       1.227676e+09
BARC        EUR       3.301354e+09
BARC        GBP       5.002534e+09
BARC        USD       6.667446e+09
BMON        CAD       2.018614e+08
BMON        GBP       4.096820e+08
BMON        USD       6.510318e+08
BNP         CAD       2.349053e+08
BNP         JPY       1.523716e+09
BNP         GBP       3.234833e+09
BNP         USD       4.576760e+09
BNP         EUR       4.935927e+09
CALIP       EUR       1.832390e+07
CALIP       USD       1.448161e+09
CALIP       GBP       3.492144e+09
CANTR       USD       3.987880e+08
CIBC        CAD       6.851792e+08
CIBC        GBP       8.861776e+08
CITI        CZK       7.549203e+06

brkrcy.set_index('currency',append=True)['Gross Loan Amount'].unstack().plot(kind="bar",stacked=True,figsize=(10,8))
plt.ylabel('Gross Loan Amount in Billions')
plt.show()

As you can see, even though the data is double ranked, the stacked bar chart is not sorted in descending order. How can I change that?

1 answer:

Answer 0: (score: 1)

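Assuming you mean descending bars on the stacked chart, consider adding a helper total column that sums all currency columns of the plotting DataFrame for each CPTY. Sort the data by this new column in descending order, then drop the helper column before plotting: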

plot_df = brkrcy.set_index('currency',append=True)['Gross Loan Amount'].unstack()
plot_df['Total'] = plot_df.sum(axis=1)                   # HELPER COLUMN

plot_df.sort_values('Total', ascending=False)\
       .drop(columns=['Total'])\
       .plot(kind="bar", stacked=True, figsize=(10,8))

plt.ylabel('Gross Loan Amount in Billions')
plt.show()
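The same ordering can also be sketched without a helper column, by computing the row totals separately and reindexing the wide frame with them. This is only an illustrative variant, not part of the original answer; the small `plot_df` below and its values are made up for the example.

```python
import pandas as pd

# Hypothetical wide frame: one row per CPTY, one column per currency
plot_df = pd.DataFrame(
    {"EUR": [1.0, 5.0, 2.0], "USD": [2.0, 1.0, 9.0]},
    index=pd.Index(["BARC", "BMON", "BNP"], name="CPTY"),
)

# Row totals sorted largest first; their index is the desired bar order
order = plot_df.sum(axis=1).sort_values(ascending=False).index

# Reorder the rows; no temporary column is added or dropped
sorted_df = plot_df.loc[order]
print(sorted_df.index.tolist())
```

Calling `sorted_df.plot(kind="bar", stacked=True, figsize=(10,8))` on the result should then draw the bars from the largest total to the smallest.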

To demonstrate with random data that hopefully mimics your actual data (seeded for reproducibility):

Data

import numpy as np
import pandas as pd

np.random.seed(42018)

CPTY = ["BARC", "BMON", "CALIP", "BNP", "CIBC", "CANTR", "CITI"]
currency = ["RUB", "ZAR", "JPY", "EUR", "GBP", "USD", "CAD"]

data = pd.DataFrame({'CPTY': ["".join(np.random.choice(CPTY,1)) for _ in range(50)],
                     'currency': ["".join(np.random.choice(currency,1)) for _ in range(50)],
                     'Gross Loan Amount': abs(np.random.randn(50))*10000000
                    }, columns = ['CPTY','currency','Gross Loan Amount'])

brkrcy = data.groupby(['CPTY','currency'], as_index=False).agg({"Gross Loan Amount": "sum"})\
             .sort_values(by=['CPTY', 'Gross Loan Amount'], ascending=[True, False])\
             .set_index('CPTY')
print(brkrcy.head(10))
#       currency  Gross Loan Amount
# CPTY                             
# BARC       JPY       3.854475e+07
# BARC       RUB       9.201352e+06
# BARC       USD       7.744341e+06
# BMON       EUR       2.780286e+07
# BMON       JPY       2.365747e+07
# BMON       CAD       8.523440e+06
# BNP        RUB       1.268484e+07
# BNP        GBP       8.149266e+06
# BNP        EUR       7.575220e+06
# CALIP      USD       3.387214e+07

Plot

import matplotlib.pyplot as plt

plot_df = brkrcy.set_index('currency',append=True)['Gross Loan Amount'].unstack()
plot_df['Total'] = plot_df.sum(axis=1)

plot_df.sort_values('Total', ascending=False)\
       .drop(columns=['Total'])\
       .plot(kind="bar", stacked=True, figsize=(10,8))

plt.ylabel('Gross Loan Amount in Billions')
plt.show()

Plot Output
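One detail worth noting: the y-axis label says "in Billions", but the plotted values are in raw units, so matplotlib will likely show large tick values or a scientific-notation offset. A minimal sketch, assuming you want the ticks to match the label, is to divide the wide frame by 1e9 before plotting. The frame and values below are hypothetical.

```python
import pandas as pd

# Hypothetical wide frame holding raw loan amounts
plot_df = pd.DataFrame(
    {"EUR": [3.3e9, 4.9e9], "USD": [6.6e9, 4.5e9]},
    index=pd.Index(["BARC", "BNP"], name="CPTY"),
)

# Scale every value so the numbers read directly in billions
plot_billions = plot_df / 1e9
print(plot_billions)
```

Plotting `plot_billions` instead of the raw frame keeps the bar order and proportions unchanged; only the axis scale differs.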