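I am having an issue when running multiprocessing with numpy: Python quits unexpectedly. I have isolated the problem, and I can confirm that multiprocessing works perfectly when running the code below: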
import numpy as np
from multiprocessing import Pool
import time


def test(args):
    x, i = args
    if i == 2:
        time.sleep(4)  # delay one task by 4 seconds
    arr = np.dot(x.T, x)  # large matrix product
    print(i)


if __name__ == '__main__':
    x = np.random.random(size=(2000, 500))
    evaluations = [(x, i) for i in range(5)]
    p = Pool()
    p.map_async(test, evaluations)
    p.close()
    p.join()
The problem occurs when I try to run the code below. It makes Python quit unexpectedly:
import numpy as np
from multiprocessing import Pool
import time


def test(args):
    x, i = args
    if i == 2:
        time.sleep(4)  # delay one task by 4 seconds
    arr = np.dot(x.T, x)  # large matrix product
    print(i)


if __name__ == '__main__':
    x = np.random.random(size=(2000, 500))
    test((x, 4))  # Added code: one call in the parent before the pool exists
    evaluations = [(x, i) for i in range(5)]
    p = Pool()
    p.map_async(test, evaluations)
    p.close()
    p.join()
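Can someone please help? I am open to all suggestions. Thanks. Note: I have tried this on two different machines and the same issue occurs on both.
Answer 0 (score: 6)
This is a known issue with multiprocessing and numpy on MacOS X, and a bit of a duplicate of: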
segfault using numpy's lapack_lite with multiprocessing on osx, not linux
http://mail.scipy.org/pipermail/numpy-discussion/2012-August/063589.html
The answer seems to be to use a BLAS other than the Apple Accelerate framework when linking Numpy ... unfortunate :(
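To find out which BLAS a given NumPy build was linked against, one option is to look at its build configuration. The sketch below assumes Python 3 and relies only on `np.show_config()`, which prints the build information. The helper names `numpy_config_text` and `mentions_accelerate` are made up for this illustration, and the substring check is a rough heuristic rather than an official detection method.

```python
import contextlib
import io

import numpy as np


def numpy_config_text():
    """Return whatever np.show_config() prints, as one string."""
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        np.show_config()
    return buf.getvalue()


def mentions_accelerate(config_text):
    """Heuristic (an assumption of this sketch): True if the text names
    Apple's Accelerate framework or its vecLib component."""
    lowered = config_text.lower()
    return "accelerate" in lowered or "veclib" in lowered


if __name__ == "__main__":
    text = numpy_config_text()
    print(text)
    print("Accelerate/vecLib mentioned:", mentions_accelerate(text))
```

If the last line reports True, the build is probably using Accelerate, and the crash described in the question is a plausible outcome on MacOS X.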
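Answer 1 (score: 5)
I found a workaround for the problem. The problem occurs when Numpy is used together with BLAS before a multiprocessing instance is initialized. My workaround is simply to put the Numpy code (the part that runs BLAS) into a single Process and then run the multiprocessing instances afterwards. This is not good coding style, but it works. See the examples below:
The following will fail - Python will quit: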
import numpy as np
from multiprocessing import Pool


def test(x):
    arr = np.dot(x.T, x)  # On large matrices, this calc will use BLAS.


if __name__ == '__main__':
    x = np.random.random(size=(2000, 500))  # Random matrix
    test(x)  # BLAS is used in the parent process here
    evaluations = [x for _ in range(5)]
    p = Pool()
    p.map_async(test, evaluations)  # This is where Python will quit, because of the prior use of BLAS.
    p.close()
    p.join()
The following will succeed:
import numpy as np
from multiprocessing import Pool, Process


def test(x):
    arr = np.dot(x.T, x)  # On large matrices, this calc will use BLAS.


if __name__ == '__main__':
    x = np.random.random(size=(2000, 500))  # Random matrix
    # Run the BLAS call in its own process so the parent never touches BLAS.
    p = Process(target=test, args=(x,))
    p.start()
    p.join()
    evaluations = [x for _ in range(5)]
    p = Pool()
    p.map_async(test, evaluations)
    p.close()
    p.join()
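For readers on Python 3.4 or later, another thing worth trying is the "spawn" start method, which starts each worker as a fresh interpreter instead of forking the parent. This is not from the answer above. It is a sketch built on the assumption that the crash comes from forking a parent that has already used BLAS, so it may or may not help on a given setup. The names `gram` and `gram_in_workers` are invented for this example.

```python
import multiprocessing as mp

import numpy as np


def gram(x):
    """Return x.T times x; on large matrices NumPy hands this to BLAS."""
    return np.dot(x.T, x)


def gram_in_workers(matrices, method="spawn", processes=2, timeout=120):
    """Compute gram() for each matrix in a pool created with `method`.

    With "spawn", workers start as fresh interpreters and do not inherit
    the parent's memory, so whatever BLAS did in the parent is not copied.
    """
    ctx = mp.get_context(method)
    with ctx.Pool(processes=processes) as pool:
        # .get() waits for the results and re-raises any worker exception;
        # the timeout turns a stuck pool into an error instead of a hang.
        return pool.map_async(gram, matrices).get(timeout=timeout)


if __name__ == "__main__":
    x = np.random.random(size=(2000, 500))
    gram(x)  # use BLAS in the parent first, as in the failing example
    results = gram_in_workers([x for _ in range(5)])
    print([r.shape for r in results])
```

With "spawn" the worker function has to be importable from the main module, so the `if __name__ == "__main__":` guard is required. Calling `.get()` also surfaces worker errors that a bare `map_async` call would silently drop.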