Plotting based on groupby

Time: 2016-06-14 21:26:56

Tags: python pandas matplotlib

I have a df like this:

Year    Grass       Crop        Forest      Ecoregion   CRP
1993    30.41268857 68.45446692 0.255632102 46e         0508common
2001    47.29988968 47.68577796 0.509939614 46e         0508common
2006    71.37357063 20.40485399 0.908684114 46e         0508common
1993    27.17246635 71.97582809 0.12611897  46k         0508common
2001    65.74087991 30.61323084 0.1229253   46k         0508common
2006    81.763099   12.4386173  0.180860941 46k         0508common
1993    30.83567893 68.14034747 0.737649228 46e         05f08
2001    59.45355722 35.68378142 0.354265748 46e         05f08
2006    64.98592643 28.61787829 0.339706881 46e         05f08
1993    28.38187702 71.40776699 0.080906149 46k         05f08
2001    81.90938511 15.4368932  0.118662352 46k         05f08
2006    86.3214671  9.207119741 0.172599784 46k         05f08
1993    18.46387279 80.77686402 0.270081631 46e         05f97
2001    41.23923454 53.1703113  0.605111585 46e         05f97
2006    65.30004066 25.45626696 0.989918731 46e         05f97
1993    20.34764075 78.68863002 0.218653535 46k         05f97
2001    55.42761042 39.96085063 0.191151874 46k         05f97
2006    76.34526161 16.53176535 0.246221691 46k         05f97

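I want to create charts based on a groupby on Ecoregion. Then, within each Ecoregion, I want to plot one chart per unique CRP. So each unique Ecoregion gets its own PDF file, and that file contains the charts based on CRP. In this case, Ecoregion 46e would have three charts (0508common, 05f08, 05f97), and Ecoregion 46k would also have three charts.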
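The layout described above (one PDF per Ecoregion, one chart per CRP inside it) can be checked with a minimal sketch before any plotting is involved. The stand-in DataFrame and the `pages_per_file` name are illustrative assumptions, not part of the original script; only the two grouping columns matter here.

```python
import pandas as pd

# Small stand-in for the CSV: one row per (Ecoregion, CRP) pair is enough
# to show the grouping; the real file has three Years per pair.
df = pd.DataFrame({
    "Year": [1993] * 6,
    "Ecoregion": ["46e", "46k"] * 3,
    "CRP": ["0508common", "0508common", "05f08", "05f08", "05f97", "05f97"],
})

# Outer key = one PDF file, inner keys = one chart (page) in that file.
pages_per_file = {}
for ecoregion, outer_group in df.groupby("Ecoregion"):
    pages_per_file[ecoregion + ".pdf"] = [
        crp for crp, _ in outer_group.groupby("CRP")
    ]

print(pages_per_file)
```

Grouping by a column name rather than a one-element list keeps each key a plain string, which makes building the file name straightforward.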

I am trying the following code:

import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.backends.backend_pdf import PdfPages
import os

df=pd.read_csv(r'C:\pathway_to_file.csv')
group=df.groupby(['Ecoregion'])
pdf_files = {}
out=r'C:\output_location'
for ecoregion, outer_group  in df.groupby(['Ecoregion']):
    with PdfPages(os.path.join(out,ecoregion + '.pdf')) as pdf:
        for crp, inner_group in outer_group.groupby(['CRP']):
            title=crp + '_' + ecoregion
            lu_colors=(['g','y','b','r', 'k'])
            plot=group.plot(x=['Year'], y=['Grass', 'Crop', 'Forest'],kind='bar', colors=lu_colors, title=title).get_figure()
            plt.legend(loc='center left', bbox_to_anchor=(1.0, 0.5))
            plt.xticks(rotation=70)
            plt.set_xlabel('Year')
            plt.set_ylabel('Percent')
            pdf.savefig(plot)
            plt.close(plot)

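But this does not work properly, and the charts are not even the bar charts I want.

Here is an example that produces a single chart the way I want it, but it does not use groupby as I would like: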

with PdfPages(r'G:\graphs.pdf') as pdf: 
    lu_colors=(['g','y','b','r', 'k'])
    ax=df.set_index('Year').plot(title='0508common_46e', kind='bar', colors=lu_colors)
    plt.legend(loc='center left', bbox_to_anchor=(1.0, 0.5))
    plt.xticks(rotation=70)
    ax.set_xlabel('Year')
    ax.set_ylabel('Percent')
    fig=plt.gcf()
    pdf.savefig(fig)
    plt.close(fig)

In this case, df would be:

    Year    Grass       Crop        Forest      Ecoregion   CRP
    1993    30.41268857 68.45446692 0.255632102 46e         0508common
    2001    47.29988968 47.68577796 0.509939614 46e         0508common
    2006    71.37357063 20.40485399 0.908684114 46e         0508common

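1 answer:

Answer 0: (score: 1)

You made a mistake in the plot call. Your loop calls `group.plot(...)`, where `group` is the groupby object built from the whole df, so the loop variables are never used. You have to plot the inner group (`inner_group` in your code, `igr` here) instead of the outer one. I changed your code slightly so it runs more smoothly: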

import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.backends.backend_pdf import PdfPages
import os

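lu_colors = ['g', 'y', 'b', 'r', 'k']

df = pd.read_csv('1.csv', header=0, usecols=[0, 1, 2, 3, 4, 5])
# Group by a column name (not a one-element list) so each key is a plain string.
for ecor, ogr in df.groupby('Ecoregion'):
    # One PDF per Ecoregion; the ./pdf directory is assumed to exist already.
    with PdfPages(os.path.join("./pdf", ecor.strip() + '.pdf')) as pdf:
        for crp, igr in ogr.groupby('CRP'):
            title = crp.strip() + '_' + ecor.strip()
            # Plot the inner group (igr): one page per CRP. The keyword is `color`, not `colors`.
            plot = igr.plot(x='Year', y=['Grass', 'Crop', 'Forest'], kind='bar', color=lu_colors, title=title).get_figure()
            plt.legend(loc='center left', bbox_to_anchor=(1.0, 0.5))
            plt.xticks(rotation=70)
            # pyplot has no set_xlabel/set_ylabel; those are Axes methods.
            ax = plt.gca()
            ax.set_xlabel('Year')
            ax.set_ylabel('Percent')
            pdf.savefig(plot, bbox_inches='tight', pad_inches=0)
            plt.close(plot)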

One of the results: [image]
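To try the nested-groupby approach without the CSV file, a self-contained sketch might look like the following. It makes several assumptions that are not in the original answer: the data is a rounded subset of the table from the question, the PDFs go to a temporary directory, the headless `Agg` backend is selected, and the names `out_dir` and `written` are purely illustrative. It also relies on a reasonably recent pandas and matplotlib.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, so no display is needed
import matplotlib.pyplot as plt
import pandas as pd
from matplotlib.backends.backend_pdf import PdfPages

# Subset of the question's table, rounded to two decimals for brevity.
df = pd.DataFrame({
    "Year": [1993, 2001, 2006] * 3,
    "Grass": [30.41, 47.30, 71.37, 27.17, 65.74, 81.76, 30.84, 59.45, 64.99],
    "Crop": [68.45, 47.69, 20.40, 71.98, 30.61, 12.44, 68.14, 35.68, 28.62],
    "Forest": [0.26, 0.51, 0.91, 0.13, 0.12, 0.18, 0.74, 0.35, 0.34],
    "Ecoregion": ["46e"] * 3 + ["46k"] * 3 + ["46e"] * 3,
    "CRP": ["0508common"] * 6 + ["05f08"] * 3,
})

out_dir = tempfile.mkdtemp()  # hypothetical output location
written = []

for ecor, ogr in df.groupby("Ecoregion"):
    pdf_path = os.path.join(out_dir, ecor + ".pdf")
    with PdfPages(pdf_path) as pdf:
        for crp, igr in ogr.groupby("CRP"):
            # Plot only the inner group: one bar chart (one page) per CRP.
            ax = igr.plot(
                x="Year",
                y=["Grass", "Crop", "Forest"],
                kind="bar",
                color=["g", "y", "b"],
                title=crp + "_" + ecor,
                rot=70,
            )
            ax.legend(loc="center left", bbox_to_anchor=(1.0, 0.5))
            ax.set_xlabel("Year")
            ax.set_ylabel("Percent")
            fig = ax.get_figure()
            pdf.savefig(fig, bbox_inches="tight")
            plt.close(fig)
    written.append(pdf_path)

print(written)
```

Working with the `Axes` returned by `igr.plot` avoids depending on pyplot's notion of the current figure, which is easy to get wrong when many figures are created in a loop.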