Question

I am generating a batch of plots by looping over a CSV file and producing multiple plots based on a groupby. The code looks like this:

import pandas as pd   
import matplotlib.pyplot as plt
from matplotlib.backends.backend_pdf import PdfPages


frame=pd.read_csv('C:\\')

pdf_files = {}
for group_name, group in frame.groupby(['Allotment','Year','Month','Day']):
    allotment,year,month,day = group_name
    if month not in pdf_files:
        pdf_files[allotment,month] = PdfPages(r'F:\Sheyenne\Statistics\IDL_stats\Allotment_histos\Month\SWIR32' + '_' + allotment + '_'+ month + '.pdf') 
    plot=group.plot(x='Percent', y='SWIR32', title=str(group_name)).get_figure()
    pdf_files[allotment,month].savefig(plot)
    plt.close(plot)

for key in pdf_files:
    pdf_files[key].close()

print("Done")

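But this returns an error saying that too many files are open. I think combining the two for loops into one might solve the problem, but I'm not sure how to do that.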
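
One thing worth checking in the loop above, offered as a possible contributor rather than a confirmed diagnosis: pdf_files is keyed by (allotment, month) tuples, but the loop tests month not in pdf_files. A bare month never equals a tuple key, so the test is always true and a new PdfPages is opened for every group, replacing the previous entry for that key before it has been closed. The sketch below reproduces only the dictionary logic. The group keys are made up and a placeholder object stands in for PdfPages.

```python
# Hypothetical group keys in the shape groupby yields them:
# (allotment, year, month, day). These values are made up.
group_names = [
    ("A1", 2001, "Jun", 1),
    ("A1", 2001, "Jun", 2),
    ("A1", 2002, "Jun", 1),
]


def count_opens(check_tuple_key):
    """Count how often the 'open a new file' branch runs."""
    opened = {}
    opens = 0
    for allotment, year, month, day in group_names:
        # The question's code looks up the bare month; the dict keys are tuples.
        lookup = (allotment, month) if check_tuple_key else month
        if lookup not in opened:
            opened[allotment, month] = object()  # stand-in for PdfPages(...)
            opens += 1
    return opens


buggy_opens = count_opens(check_tuple_key=False)
fixed_opens = count_opens(check_tuple_key=True)
print(buggy_opens, fixed_opens)  # 3 1
```

Testing (allotment, month) not in pdf_files opens each file once, so the number of open files is bounded by the number of distinct allotment and month pairs.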

Answer 1

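Is there any reason you can't group by ['Allotment', 'Month'] first, so that each pass of the outer loop handles exactly one PDF file? (It is probably better to use with PdfPages(...) as pdf_file: so the file is closed automatically.)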

basename = r'F:\Sheyenne\Statistics\IDL_stats\Allotment_histos\Month\SWIR32'
for file_name, file in frame.groupby(['Allotment','Month']):
    allotment, month = file_name
    with PdfPages('{}_{}_{}.pdf'.format(basename, allotment, month)) as pdf_file:
        for group_name, group in file.groupby(['Allotment','Month','Year', 'Day']):
            plot = group.plot(x='Percent', y='SWIR32', title=str(group_name)).get_figure()
            pdf_file.savefig(plot)
            plt.close(plot)
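
For anyone who wants to try the nested groupby without the original data, here is a minimal self-contained sketch. The DataFrame contents, the temporary output directory, and the Agg backend are all assumptions made for illustration. The inner loop groups only by Year and Day, a simplification that relies on Allotment and Month already being fixed by the outer loop.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # headless backend so no display is needed
import matplotlib.pyplot as plt
import pandas as pd
from matplotlib.backends.backend_pdf import PdfPages

# Made-up stand-in for the CSV: 2 allotments x 2 years x 2 months, 3 rows each.
rows = []
for allotment in ("A1", "A2"):
    for year in (2001, 2002):
        for month in ("Jun", "Jul"):
            for percent in (10, 20, 30):
                rows.append({
                    "Allotment": allotment, "Year": year, "Month": month,
                    "Day": 1, "Percent": percent, "SWIR32": percent / 100.0,
                })
frame = pd.DataFrame(rows)

out_dir = tempfile.mkdtemp()  # hypothetical output location
basename = os.path.join(out_dir, "SWIR32")
page_counts = {}

# Outer loop: one PDF per (Allotment, Month); only one file is open at a time.
for (allotment, month), per_file in frame.groupby(["Allotment", "Month"]):
    path = "{}_{}_{}.pdf".format(basename, allotment, month)
    with PdfPages(path) as pdf_file:
        # Inner loop: one page per (Year, Day) within this file.
        for group_name, group in per_file.groupby(["Year", "Day"]):
            fig = group.plot(x="Percent", y="SWIR32",
                             title=str(group_name)).get_figure()
            pdf_file.savefig(fig)
            plt.close(fig)
        page_counts[allotment, month] = pdf_file.get_pagecount()

print(page_counts)
```

With this made-up data, each of the four PDFs gets two pages (one per year), and every file is closed before the next one is opened.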

Answer 2

Would this work?

import pandas as pd   
import matplotlib.pyplot as plt
from matplotlib.backends.backend_pdf import PdfPages

frame=pd.read_csv('C:\\')

pdf_files = {}
for group_name, group in frame.groupby(['Allotment','Year','Month','Day']):
    allotment,year,month,day = group_name
    if month not in pdf_files:
        pdf_files[allotment,month] = PdfPages(r'F:\Sheyenne\Statistics\IDL_stats\Allotment_histos\Month\SWIR32' + '_' + allotment + '_'+ month + '.pdf') 
    plot=group.plot(x='Percent', y='SWIR32', title=str(group_name)).get_figure()
    pdf_files[allotment,month].savefig(plot)
    pdf_files[allotment,month].close()
    plt.close(plot)

print("Done")

Basically, just make sure you close each file once you are done writing to it.
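
If several files really do need to stay open at once, one way to guarantee they all get closed is contextlib.ExitStack from the standard library (Python 3.3+). The sketch below uses plain text files in a temporary directory as stand-ins, with made-up keys. PdfPages also supports the with statement, so it should be usable with enter_context in the same way, though that is not exercised here. Note that this only guarantees cleanup; it does not reduce how many files are open at the same time.

```python
import contextlib
import os
import tempfile

# Hypothetical (allotment, month) keys, one output file per key.
keys = [("A1", "Jun"), ("A1", "Jul"), ("A2", "Jun")]
out_dir = tempfile.mkdtemp()

handles = {}
with contextlib.ExitStack() as stack:
    for allotment, month in keys:
        path = os.path.join(out_dir, "SWIR32_{}_{}.txt".format(allotment, month))
        # enter_context registers the file so the stack closes it on exit,
        # even if an exception is raised later in the block.
        handles[allotment, month] = stack.enter_context(open(path, "w"))
    for key, handle in handles.items():
        handle.write("page for {}\n".format(key))

# Every handle has been closed by the time the with block exits.
all_closed = all(handle.closed for handle in handles.values())
print(all_closed)  # True
```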

Combining two for loops to improve efficiency
