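I'm trying to save many matplotlib figures to PNG files on disk. Because savefig() is slow, I tried using the multiprocessing module to speed things up.
Here is my code (my environment is Windows XP + python_2.6.1 + Matplotlib_1.2.0 + multiprocessing_0.70a1):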
import multiprocessing
from figure_creation_mudule import fig_list

def savefig_worker(fig, img_type, folder_path):
    file_name = fig.FM_figname
    fig.savefig(folder_path+"\\"+file_name+"."+img_type, format=img_type)
    return None

if __name__ == '__main__':
    pool = multiprocessing.Pool()
    for fig in fig_list:
        pool.apply_async(savefig_worker, [fig, 'png', 'D:\\img_folder'])
    pool.close()
    pool.join()
fig_list is a list imported from another module; it contains matplotlib figure objects.
>>> fig_list
[<matplotlib.figure.Figure object at 0x0AAA1670>, <matplotlib.figure.Figure object at 0x0AD2B210>, <matplotlib.figure.Figure object at 0x0B277FD0>]
When I run the code, it fails with the following error:
Exception in thread Thread-2:
Traceback (most recent call last):
  File "D:\Python\lib\threading.py", line 522, in __bootstrap_inner
    self.run()
  File "D:\Python\lib\threading.py", line 477, in run
    self.__target(*self.__args, **self.__kwargs)
  File "D:\Python\lib\multiprocessing\pool.py", line 225, in _handle_tasks
    put(task)
PicklingError: Can't pickle <type 'function'>: attribute lookup __builtin__.function failed
What does this mean, and how can I fix it?
Answer 0 (score: 1)
I looked into this, and Pool.apply_async() does in fact pickle its arguments behind the scenes. To confirm this, try the following in a REPL:
>>> from multiprocessing import Pool
>>> def test(obj):
...     print obj
...
>>> class A():
...     def __getstate__(self):
...         print "pickling"
...         return {}
...
>>> pool = Pool()
>>> pool.apply_async(test, [A()])
<multiprocessing.pool.ApplyResult object at 0x10bbe82d0>
pickling
>>> <__main__.A instance at 0x10bbe83b0>
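The traceback in the question complains about a function that cannot be pickled, which suggests (this is an assumption, not something verified against matplotlib 1.2.0) that the figure objects hold a reference to such a function somewhere. A minimal Python 3 sketch of how one might check an argument before handing it to the pool is shown below; is_picklable is a hypothetical helper name, and plain dicts stand in for the real objects.

```python
import pickle


def is_picklable(obj):
    """Return True if pickle.dumps(obj) succeeds, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


# Plain data (strings, numbers, lists, dicts) pickles without trouble.
plain_task = {"name": "fig1", "x": [0, 1, 2], "y": [0, 1, 4]}

# A lambda cannot be found by name when unpickling, so pickling it fails.
# This mimics an object that keeps a reference to an unpicklable function.
task_with_function = {"name": "fig2", "formatter": lambda value: str(value)}

print(is_picklable(plain_task))
print(is_picklable(task_with_function))
```

Anything that fails this check will also fail when passed to Pool.apply_async(), because the pool pickles every task before sending it to a worker.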
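To avoid this, you need to use something other than multiprocessing.Pool to do the work. multiprocessing.Process might work, although on Windows, where child processes are started fresh rather than forked, the arguments passed to a Process are pickled as well. Either way, be careful not to spawn too many processes, or you will slow things down instead of speeding them up.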
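As a rough illustration of keeping the process count bounded, the Python 3 sketch below splits the work into at most one chunk per CPU and starts one multiprocessing.Process per chunk. The names split_evenly, run_chunk and run_in_processes are hypothetical helpers, not part of multiprocessing, and on Windows both the worker function and the items still have to be picklable.

```python
import multiprocessing


def split_evenly(items, n_chunks):
    """Split items into at most n_chunks lists whose sizes differ by at most one."""
    n_chunks = max(1, min(n_chunks, len(items)))
    return [items[i::n_chunks] for i in range(n_chunks)]


def run_chunk(worker, chunk):
    """Runs inside a child process: apply worker to each item in turn."""
    for item in chunk:
        worker(item)


def run_in_processes(worker, items, max_procs=None):
    """Process all items using no more than max_procs child processes."""
    if max_procs is None:
        # Assumption: one process per CPU is a reasonable upper bound.
        max_procs = multiprocessing.cpu_count()
    procs = [
        multiprocessing.Process(target=run_chunk, args=(worker, chunk))
        for chunk in split_evenly(items, max_procs)
    ]
    for proc in procs:
        proc.start()
    for proc in procs:
        proc.join()
```

On Windows, run_in_processes(...) should be called from under an if __name__ == '__main__': guard, and worker must be a module-level function so that the child process can import it.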
Edit: If you do want to use multiprocessing.Pool, this question/answer should help.
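One common way to keep using a pool, which is not necessarily what the linked answer proposes, is to send each worker only plain data and let the worker build and save the figure itself, so that no Figure object ever has to be pickled. The Python 3 sketch below assumes each figure can be described by a hypothetical spec dict with "name", "x" and "y" keys; output_path, render_and_save and save_all are illustrative names.

```python
import multiprocessing
import os


def output_path(folder_path, name, img_type):
    """Build the target file path, e.g. <folder_path>/<name>.png."""
    return os.path.join(folder_path, name + "." + img_type)


def render_and_save(spec, img_type, folder_path):
    """Create a figure from plain data inside the worker and save it.

    spec is a hypothetical dict with keys "name", "x" and "y"; only
    picklable values (strings, lists of numbers) cross the process boundary.
    """
    # Imported here so that each worker process sets up matplotlib itself.
    from matplotlib.backends.backend_agg import FigureCanvasAgg
    from matplotlib.figure import Figure

    fig = Figure()
    FigureCanvasAgg(fig)  # attach a non-interactive canvas for rendering
    ax = fig.add_subplot(111)
    ax.plot(spec["x"], spec["y"])
    path = output_path(folder_path, spec["name"], img_type)
    fig.savefig(path, format=img_type)
    return path


def save_all(specs, img_type, folder_path):
    """Render every spec in a worker pool and return the written paths."""
    pool = multiprocessing.Pool()
    try:
        results = [
            pool.apply_async(render_and_save, (spec, img_type, folder_path))
            for spec in specs
        ]
        # get() re-raises any exception that occurred in a worker.
        return [result.get() for result in results]
    finally:
        pool.close()
        pool.join()
```

As in the question's code, save_all(...) would be called from under an if __name__ == '__main__': guard on Windows. Calling get() on each result also surfaces worker errors that apply_async() alone would silently drop.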