Deadlock with the multiprocessing module

Date: 2015-11-20 11:01:37

Tags: python deadlock python-multiprocessing

I have a function that, without multiprocessing, loops over an array of 3-tuples and performs some computation. This array can be very long (> 1 million entries), so I thought using multiple processes might help speed things up.

I start from a list of points (random_points) and use it to build all possible combinations of three points (combList). This combList is then passed to my function. My basic code works, but only when the random_points list has 18 or fewer entries.

from scipy import stats
import itertools
import multiprocessing as mp
import numpy as np

def calc3PointsList( points,output ):
  xy = []
  r = []
  for point in points:
    # do stuff with points and append results to xy and r
    pass
  output.put( (xy, r) )


output = mp.Queue()

random_points = [ (np.array((stats.uniform(-0.5,1).rvs(),stats.uniform(-0.5,1).rvs()))) for _ in range(18)]
combList = list(itertools.combinations(random_points, 3))
N = 6
processes = [mp.Process(target=calc3PointsList, args=(combList[(i-1)*len(combList)//(N-1):i*len(combList)//(N-1)], output)) for i in range(1, N)]

for p in processes:
  p.start()
for p in processes:
  p.join()
results = [output.get() for p in processes]

As soon as the random_points list is longer than 18 entries, the program deadlocks. With 18 or fewer, it simply finishes fine. Am I using the multiprocessing module in the wrong way?

1 Answer:

Answer 0 (score: 1)

I don't see any other obvious errors in what you posted, but there is one thing you should do: start new processes inside an if __name__ == "__main__": block; see the programming guidelines.

from scipy import stats
import itertools
import multiprocessing as mp
import numpy as np

def calc3PointsList( points,output ):
  xy = []
  r = []
  for point in points:
    # do stuff with points and append results to xy and r
    pass
  output.put( (xy, r) )

if __name__ == "__main__":
    output = mp.Queue()
    random_points = [ (np.array((stats.uniform(-0.5,1).rvs(),stats.uniform(-0.5,1).rvs()))) for _ in range(18)]
    combList = list(itertools.combinations(random_points, 3))
    N = 6
    processes = [mp.Process(target=calc3PointsList, args=(combList[(i-1)*len(combList)//(N-1):i*len(combList)//(N-1)], output)) for i in range(1, N)]

    for p in processes:
        p.start()
    # drain the queue before joining: a child process blocks in put() once
    # the queue's underlying pipe buffer fills up, so joining first deadlocks
    results = [output.get() for p in processes]
    for p in processes:
        p.join()
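As an aside, manual chunking plus a Queue can be replaced by multiprocessing.Pool, which handles splitting the work and collecting results for you, and so sidesteps this class of deadlock entirely. A minimal sketch (the per-triple function calc3points and the integer points are placeholders, not the asker's actual computation):

```python
import itertools
import multiprocessing as mp

def calc3points(points):
    # hypothetical stand-in for the real per-triple computation:
    # here we just sum the three points
    return sum(points)

if __name__ == "__main__":
    random_points = list(range(10))  # placeholder data
    comb_list = list(itertools.combinations(random_points, 3))
    with mp.Pool(processes=4) as pool:
        # Pool.map splits comb_list into chunks and gathers the
        # results in order; no explicit Queue or join() needed
        results = pool.map(calc3points, comb_list)
    print(len(results))
```

Pool.map blocks until every worker has finished, so results comes back as an ordinary list in input order.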