I'm trying to speed up an algorithm that works with huge matrices. I've parallelized it to operate on rows, and put the data matrix in shared memory so the system doesn't get bogged down. However, instead of working smoothly as I'd hoped, it now raises a weird error about files, which I don't understand since I don't even open any files in the program.
Here's a rough approximation of what's going on in the actual program, with 1000 iterations standing in for what happens in the real algorithm:
import multiprocessing
import ctypes
import numpy as np

shared_array_base = multiprocessing.Array(ctypes.c_double, 10*10)
shared_array = np.ctypeslib.as_array(shared_array_base.get_obj())
shared_array = shared_array.reshape(10, 10)

def my_func(i, shared_array):
    shared_array[i,:] = i

def pool_init(_shared_array, _constans):
    global shared_array, constans
    shared_array = _shared_array
    constans = _constans

def pool_my_func(i):
    my_func(i, shared_array)

if __name__ == '__main__':
    for i in np.arange(1000):
        pool = multiprocessing.Pool(8, pool_init, (shared_array, 4))
        pool.map(pool_my_func, range(10))
    print(shared_array)
This raises the following error (I'm on OSX):
Traceback (most recent call last):
File "weird.py", line 24, in <module>
pool = multiprocessing.Pool(8, pool_init, (shared_array, 4))
File "//anaconda/lib/python3.4/multiprocessing/context.py", line 118, in Pool
context=self.get_context())
File "//anaconda/lib/python3.4/multiprocessing/pool.py", line 168, in __init__
self._repopulate_pool()
File "//anaconda/lib/python3.4/multiprocessing/pool.py", line 233, in _repopulate_pool
w.start()
File "//anaconda/lib/python3.4/multiprocessing/process.py", line 105, in start
self._popen = self._Popen(self)
File "//anaconda/lib/python3.4/multiprocessing/context.py", line 267, in _Popen
return Popen(process_obj)
File "//anaconda/lib/python3.4/multiprocessing/popen_fork.py", line 21, in __init__
self._launch(process_obj)
File "//anaconda/lib/python3.4/multiprocessing/popen_fork.py", line 69, in _launch
parent_r, child_w = os.pipe()
OSError: [Errno 24] Too many open files
I'm stumped. I'm not even opening any files here. All I want to do is pass shared_array to the individual processes in a way that won't clog up system memory; I don't even need to modify it within the parallelized process, if that helps at all.
Also, in case it matters, the exact error raised by the proper code itself is slightly different:
Traceback (most recent call last):
File "tcap.py", line 206, in <module>
File "tcap.py", line 202, in main
File "tcap.py", line 181, in tcap_cluster
File "tcap.py", line 133, in ap_step
File "//anaconda/lib/python3.4/multiprocessing/context.py", line 118, in Pool
File "//anaconda/lib/python3.4/multiprocessing/pool.py", line 168, in __init__
File "//anaconda/lib/python3.4/multiprocessing/pool.py", line 233, in _repopulate_pool
File "//anaconda/lib/python3.4/multiprocessing/process.py", line 105, in start
File "//anaconda/lib/python3.4/multiprocessing/context.py", line 267, in _Popen
File "//anaconda/lib/python3.4/multiprocessing/popen_fork.py", line 21, in __init__
File "//anaconda/lib/python3.4/multiprocessing/popen_fork.py", line 69, in _launch
OSError: [Errno 24] Too many open files
So yeah, I'm at a loss as to what to do. Any help would be greatly appreciated. Thanks in advance!
Answer 0 (score: 6)
You're trying to create 1000 process pools, which are never reclaimed (for some reason); these have consumed all the available file descriptors in your main process for the pipes that are used for communicating between the main process and its children.
Perhaps what you want is:
pool = multiprocessing.Pool(8, pool_init, (shared_array, 4))
for _ in range(1000):
    pool.map(pool_my_func, range(10))