I am making a bunch of plots by looping over a CSV file, then producing multiple plots based on a groupby. The code is as follows:
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.backends.backend_pdf import PdfPages

frame=pd.read_csv('C:\\')
pdf_files = {}
for group_name, group in frame.groupby(['Allotment','Year','Month','Day']):
    allotment,year,month,day = group_name
    if month not in pdf_files:
        pdf_files[allotment,month] = PdfPages(r'F:\Sheyenne\Statistics\IDL_stats\Allotment_histos\Month\SWIR32' + '_' + allotment + '_'+ month + '.pdf')
    plot=group.plot(x='Percent', y='SWIR32', title=str(group_name)).get_figure()
    pdf_files[allotment,month].savefig(plot)
    plt.close(plot)
for key in pdf_files:
    pdf_files[key].close()
print "Done"
But this returns an error saying that too many files are open. I think merging the two for loops into one might solve the problem, but I'm not sure how to do that.
Answer 0: (score: 1)
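Is there any reason you can't group by ['Allotment', 'Month'] first? Then each iteration of the outer loop handles just one PDF file (and it is probably better to use with PdfPages(...) as pdf_file: so the file is closed as soon as its pages are written).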
basename = r'F:\Sheyenne\Statistics\IDL_stats\Allotment_histos\Month\SWIR32'
for file_name, file in frame.groupby(['Allotment','Month']):
    allotment, month = file_name
    with PdfPages('{}_{}_{}.pdf'.format(basename, allotment, month)) as pdf_file:
        for group_name, group in file.groupby(['Allotment','Month','Year', 'Day']):
            plot = group.plot(x='Percent', y='SWIR32', title=str(group_name)).get_figure()
            pdf_file.savefig(plot)
            plt.close(plot)
Answer 1: (score: 0)
Would this work?
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.backends.backend_pdf import PdfPages

frame=pd.read_csv('C:\\')
pdf_files = {}
for group_name, group in frame.groupby(['Allotment','Year','Month','Day']):
    allotment,year,month,day = group_name
    if month not in pdf_files:
        pdf_files[allotment,month] = PdfPages(r'F:\Sheyenne\Statistics\IDL_stats\Allotment_histos\Month\SWIR32' + '_' + allotment + '_'+ month + '.pdf')
    plot=group.plot(x='Percent', y='SWIR32', title=str(group_name)).get_figure()
    pdf_files[allotment,month].savefig(plot)
    pdf_files[allotment,month].close()
    plt.close(plot)
print "Done"
Basically, just make sure you close each file once you have finished writing to it.
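One detail in the loop above may be worth checking; this is a sketch, not a confirmed diagnosis. pdf_files is keyed by the tuple (allotment, month), but the membership test looks up month alone. A bare value never equals a tuple key, so that test is always true and a new PdfPages would be created for every group. The snippet below reproduces the behaviour with a plain dict. The strings stand in for PdfPages objects and the sample pairs are made up.

```python
# Plain-dict illustration: the strings stand in for PdfPages objects, and
# the (allotment, month) pairs are made-up sample data.
groups = [("A1", "06"), ("A1", "06"), ("A2", "06")]

# Check used in the loop above: looks up the month alone.
pdf_files = {}
opened_with_month_check = 0
for allotment, month in groups:
    if month not in pdf_files:  # always true, because every key is a tuple
        pdf_files[allotment, month] = "handle"
        opened_with_month_check += 1

# Check that matches the key actually stored in the dict.
pdf_files = {}
opened_with_tuple_check = 0
for allotment, month in groups:
    if (allotment, month) not in pdf_files:
        pdf_files[allotment, month] = "handle"
        opened_with_tuple_check += 1

print(opened_with_month_check, opened_with_tuple_check)  # prints: 3 2
```

With the tuple check, a file is opened once per distinct (allotment, month) pair instead of once per group. Each file then has to stay open until its last page has been saved, so it should be closed only after that point.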