How do I save figures to files using multiprocessing in Python?

Asked: 2013-03-05 13:10:34

Tags: python matplotlib multiprocessing

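I'm trying to save a large number of matplotlib figures to PNG files on disk. Because savefig() is slow, I tried to use the multiprocessing module to speed things up.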

Here is my code (my environment is Windows XP + python_2.6.1 + Matplotlib_1.2.0 + multiprocessing_0.70a1):

import multiprocessing
from figure_creation_mudule import fig_list

def savefig_worker(fig, img_type, folder_path):
    file_name = fig.FM_figname 
    fig.savefig(folder_path+"\\"+file_name+"."+img_type, format=img_type)
    return None

if __name__ == '__main__':
    pool = multiprocessing.Pool()
    for fig in fig_list:
        pool.apply_async(savefig_worker, [fig, 'png', 'D:\\img_folder'])
    pool.close()
    pool.join()

fig_list is a list imported from another module; it contains matplotlib Figure objects.

>>> fig_list
[<matplotlib.figure.Figure object at 0x0AAA1670>, <matplotlib.figure.Figure object at 0x0AD2B210>, <matplotlib.figure.Figure object at 0x0B277FD0>]

When I run the code, it fails with this error:

Exception in thread Thread-2:
Traceback (most recent call last):
  File "D:\Python\lib\threading.py", line 522, in __bootstrap_inner
    self.run()
  File "D:\Python\lib\threading.py", line 477, in run
    self.__target(*self.__args, **self.__kwargs)
  File "D:\Python\lib\multiprocessing\pool.py", line 225, in _handle_tasks
    put(task)
PicklingError: Can't pickle <type 'function'>: attribute lookup __builtin__.function failed

What does this mean, and how can I fix it?

1 answer:

Answer 0 (score: 1)

I looked into this, and Pool.apply_async() does in fact pickle its arguments behind the scenes. To confirm this, try the following in a REPL:

>>> from multiprocessing import Pool
>>> def test(obj):
...   print obj
... 
>>> class A():
...   def __getstate__(self):
...     print "pickling"
...     return {}
... 
>>> pool = Pool()
>>> pool.apply_async(test, [A()])
<multiprocessing.pool.ApplyResult object at 0x10bbe82d0>
pickling

>>> <__main__.A instance at 0x10bbe83b0>
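
The PicklingError in the question says that something reachable from the task arguments, presumably somewhere inside the Figure object, is a function that pickle could not look up by name. As a rough sketch (written for Python 3, not the Python 2.6 from the question), you can check an argument yourself before handing it to the pool. FigureLike and is_picklable are hypothetical names made up for this illustration; they are not part of matplotlib or multiprocessing.

```python
import pickle


class FigureLike(object):
    """Hypothetical stand-in for an object that holds a reference to a function."""

    def __init__(self, callback):
        self.callback = callback


def is_picklable(obj):
    """Return True if pickle.dumps(obj) succeeds, as Pool requires for task arguments."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        # Different Python versions raise different exception types here.
        return False


if __name__ == "__main__":
    # Plain data pickles without trouble.
    print(is_picklable([1, 2, 3]))
    # A lambda has no importable name, so pickling the object that holds it fails.
    print(is_picklable(FigureLike(lambda: None)))
```

Running a check like this on each item of fig_list should show whether the figures themselves are what the pool cannot send.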

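To avoid this, you'll need to use something other than multiprocessing.Pool to get the job done. multiprocessing.Process could work. However, take care not to spawn too many processes, or you'll slow things down rather than speed them up.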
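
A minimal sketch of keeping the process count bounded with multiprocessing.Process, assuming Python 3. One caveat: on Windows, child processes are started by spawning a fresh interpreter, so the arguments passed to Process are pickled as well and still have to be picklable. The work items here are plain integers, and split_evenly and handle_chunk are hypothetical helpers for illustration only; a real worker would render and save images instead of squaring numbers.

```python
import multiprocessing


def split_evenly(items, parts):
    """Split items into at most `parts` round-robin chunks, dropping empty ones."""
    chunks = [items[i::parts] for i in range(parts)]
    return [chunk for chunk in chunks if chunk]


def handle_chunk(chunk, results):
    """Placeholder worker: a real one would render and save one image per item."""
    for item in chunk:
        results.put(item * item)


if __name__ == "__main__":
    work = list(range(10))
    # Never start more processes than there are CPU cores or work items.
    n_procs = min(multiprocessing.cpu_count(), len(work))
    results = multiprocessing.Queue()
    procs = [
        multiprocessing.Process(target=handle_chunk, args=(chunk, results))
        for chunk in split_evenly(work, n_procs)
    ]
    for p in procs:
        p.start()
    # Drain the queue before joining so no child blocks on a full pipe.
    squares = sorted(results.get() for _ in work)
    for p in procs:
        p.join()
    print(squares)
```

Each process handles a whole chunk, so the cost of starting a process is paid once per core rather than once per figure.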

Edit: if you do intend to use multiprocessing.Pool, this question/answer should help.
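
One way to keep using Pool is to send each worker only plain, picklable data and let the worker build and save the figure itself. The following is a sketch under several assumptions: Python 3, a matplotlib install with the non-interactive Agg backend, and figures that can be rebuilt from simple lists. save_figure, make_specs and the img_folder path are hypothetical names invented for this example; they stand in for whatever figure_creation_mudule does in the question.

```python
import multiprocessing
import os


def save_figure(spec):
    """Build one figure from plain data and save it; runs inside a worker process."""
    # Import inside the worker so each process selects the non-GUI backend itself.
    import matplotlib
    matplotlib.use("Agg")
    import matplotlib.pyplot as plt

    name, xs, ys, folder = spec
    fig, ax = plt.subplots()
    ax.plot(xs, ys)
    path = os.path.join(folder, name + ".png")
    fig.savefig(path, format="png")
    plt.close(fig)  # free the figure so long-running workers do not accumulate memory
    return path


def make_specs(folder, count):
    """Hypothetical stand-in for the data behind fig_list: only strings and lists."""
    xs = [0, 1, 2, 3]
    return [("figure_%d" % i, xs, [i * x for x in xs], folder) for i in range(count)]


if __name__ == "__main__":
    folder = "img_folder"
    os.makedirs(folder, exist_ok=True)
    with multiprocessing.Pool() as pool:
        # Only the small spec tuples are pickled, never a Figure object.
        for path in pool.map(save_figure, make_specs(folder, 3)):
            print(path)
```

The pool caps the number of worker processes at the CPU count by default, which also addresses the concern about spawning too many processes.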