I run into a memory problem when I try to run the following.
Consider a function that, for each argument arg_i_j, returns a pandas DataFrame:
def some_fun(arg_i_j):
    ...
    return DF_i_j
Now, I have constructed all the arguments I want to test in the following format,
All_lists = [ [arg_0_0,..., arg_0_N], ..., [arg_k_0,..., arg_k_N]],
and I am trying to execute the following code in the main function:
# Version A: one pool, created once and reused for every list
results_per_list = []
the_pool = multiprocessing.Pool(processes=multiprocessing.cpu_count(), initializer=..., initargs=...)
for a_list in All_lists:
    results = the_pool.map(some_fun, a_list)
    results_per_list.append(results)
the_pool.close()
the_pool.join()
# then use results_per_list to do operations
In the end I get the error:
...\multiprocessing\connection.py", line 251, in recv
return _ForkingPickler.loads(buf.getbuffer())
MemoryError
1) Does anyone know how to solve this problem?
2) Do you see any problem with creating a "pool" object for each "a_list" in "All_lists", as shown below?
# Version B: a new pool is created and torn down for every list
results_per_list = []
for a_list in All_lists:
    the_pool = multiprocessing.Pool(processes=multiprocessing.cpu_count(), initializer=..., initargs=...)
    results = the_pool.map(some_fun, a_list)
    results_per_list.append(results)
    the_pool.close()
    the_pool.join()
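
For context, the traceback shows the failure in the parent process, while it unpickles a result sent back by a worker. In both versions the parent ends up holding every returned DataFrame in results_per_list. One direction that might help is to shrink each result inside the worker before it is sent back, and to read results lazily with imap. Below is a minimal sketch of that idea, not a confirmed fix. It assumes the later operations only need a reduced form of each result. The body of some_fun here is a stdlib stand-in for the real DataFrame-returning function, and summarize, some_fun_reduced and run_all are hypothetical names introduced only for illustration.

```python
import multiprocessing


def some_fun(arg):
    # Stand-in for the real function: returns a large list of rows
    # where the real code would return a pandas DataFrame.
    return [{"arg": arg, "value": arg * i} for i in range(1000)]


def summarize(rows):
    # Hypothetical reduction: keep only what later steps need.
    return {
        "arg": rows[0]["arg"],
        "n_rows": len(rows),
        "total": sum(r["value"] for r in rows),
    }


def some_fun_reduced(arg):
    # Runs in the worker: build the full result there, but send back
    # only the small summary, so the parent never unpickles the big object.
    return summarize(some_fun(arg))


def run_all(all_lists, processes=None):
    results_per_list = []
    # One pool reused for every list, as in Version A.
    with multiprocessing.Pool(processes=processes) as pool:
        for a_list in all_lists:
            # imap yields results one at a time, in input order.
            results = list(pool.imap(some_fun_reduced, a_list, chunksize=1))
            results_per_list.append(results)
    return results_per_list


if __name__ == "__main__":
    print(run_all([[1, 2], [3, 4]], processes=2))
```

If the full DataFrames really are needed afterwards, this reduction does not apply as written; a variant where each worker writes its DataFrame to disk and returns only the file path would follow the same pattern.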