Here is my approximate workflow:
import gc
import multiprocessing as mp
import pickle

import numpy as np
import psutil

# some_function is defined elsewhere (described below)
test = [np.random.choice(range(1, 1000), 1000000) for el in range(1, 1000)]
step_size = 10**4
for i in range(0, len(test), step_size):
    p = mp.Pool(10)
    temp_list = test[i:i+step_size]
    results = p.map(some_function, temp_list)
    gc.collect()
    mem = psutil.virtual_memory()
    print('Memory available in GB' + str(mem.available/(1024**3)))
    with open('file_to_store' + str(int(i/step_size)) + '.pickle', 'wb') as f:
        pickle.dump(results, f)
    p.close()
It produces the following output:
Memory available in GB36.038265228271484
Memory available in GB23.011260986328125
Memory available in GB9.837642669677734
And then the error:
---------------------------------------------------------------------------
MemoryError                               Traceback (most recent call last)
<ipython-input-9-d17700b466c3> in <module>()
    260
    261 with open(file_path_1+str(int(i/step_size))+'.pickle', 'wb') as f:
--> 262     pickle.dump(vec_1, f)
    263 with open(file_path_2+str(int(i/step_size))+'.pickle', 'wb') as f:
    264     pickle.dump(vec_2, f)

MemoryError:
some_function does some light processing and does not create any global variables that could linger in memory.
I don't understand why the amount of available memory keeps decreasing until it runs out.
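One variation that might help narrow this down, offered as a rough sketch and not as a confirmed fix: the loop above creates a new mp.Pool(10) on every iteration and only calls close() on it, never join(). The version below creates the pool once, reuses it for every batch, and shuts it down with close() followed by join(). The helper name run_batches, the body of some_function, the plain-list inputs, and the temporary output directory are all assumptions made for illustration; the real code would pass in the NumPy arrays from test.

```python
import multiprocessing as mp
import os
import pickle
import tempfile


def some_function(values):
    # Hypothetical stand-in for the real worker function: sums one input.
    return sum(values)


def run_batches(pool, data, step_size, out_dir):
    """Map some_function over data in batches and pickle each batch's results.

    The pool is created by the caller and reused for every batch.
    Returns the list of files written.
    """
    paths = []
    for i in range(0, len(data), step_size):
        batch = data[i:i + step_size]
        results = pool.map(some_function, batch)
        name = 'file_to_store' + str(i // step_size) + '.pickle'
        path = os.path.join(out_dir, name)
        with open(path, 'wb') as f:
            pickle.dump(results, f)
        paths.append(path)
        # Drop the reference so this batch's results can be freed
        # before the next call to map.
        del results
    return paths


if __name__ == '__main__':
    # Small made-up inputs, only to show the call pattern.
    data = [list(range(n)) for n in range(1, 26)]
    with tempfile.TemporaryDirectory() as out_dir:
        pool = mp.Pool(4)  # created once, outside the batch loop
        try:
            written = run_batches(pool, data, 10, out_dir)
        finally:
            pool.close()  # no more tasks will be submitted
            pool.join()   # wait for the worker processes to exit
        print(len(written), 'files written')
```

If available memory still drops by a similar amount per batch with this layout, the pool lifecycle is probably not the main factor, and the size of results or of the inputs sent to the workers would be the next thing to measure.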