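I have a function that, without multiprocessing, loops over an array of 3-tuples and does some calculations. This array can be very long (> 1 million entries), so I thought using several processes could help speed things up.
I start with a list of points (random_points) and use it to create all possible combinations of three points (combList). This combList is then passed to my function.
My basic code works, but only when the random_points list has 18 or fewer entries.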
from scipy import stats
import numpy as np
import itertools
import multiprocessing as mp

def calc3PointsList(points, output):
    xy = []
    r = []
    for point in points:
        # do stuff with points and append results to xy and r
        pass
    output.put((xy, r))

output = mp.Queue()
random_points = [np.array((stats.uniform(-0.5, 1).rvs(), stats.uniform(-0.5, 1).rvs())) for _ in range(18)]
combList = list(itertools.combinations(random_points, 3))
N = 6
# Split combList into N-1 slices, one per worker process (integer division keeps the indices ints)
processes = [mp.Process(target=calc3PointsList, args=(combList[(i-1)*len(combList)//(N-1):i*len(combList)//(N-1)], output)) for i in range(1, N)]
for p in processes:
    p.start()
for p in processes:
    p.join()
results = [output.get() for p in processes]
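As soon as the random_points list has more than 18 entries, the program deadlocks. With 18 or fewer it finishes fine. Am I using the multiprocessing module in the wrong way?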
Answer 0 (score: 1)
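I don't see anything else obviously wrong in what you posted, but there is one thing you should do: start new processes inside an if __name__ == "__main__": block, see the programming guidelines.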
from scipy import stats
import numpy as np
import itertools
import multiprocessing as mp

def calc3PointsList(points, output):
    xy = []
    r = []
    for point in points:
        # do stuff with points and append results to xy and r
        pass
    output.put((xy, r))

if __name__ == "__main__":
    output = mp.Queue()
    random_points = [np.array((stats.uniform(-0.5, 1).rvs(), stats.uniform(-0.5, 1).rvs())) for _ in range(18)]
    combList = list(itertools.combinations(random_points, 3))
    N = 6
    # Split combList into N-1 slices, one per worker process (integer division keeps the indices ints)
    processes = [mp.Process(target=calc3PointsList, args=(combList[(i-1)*len(combList)//(N-1):i*len(combList)//(N-1)], output)) for i in range(1, N)]
    for p in processes:
        p.start()
    for p in processes:
        p.join()
    # Call get() (with parentheses) so the list holds results, not bound methods
    results = [output.get() for x in range(output.qsize())]
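
A note on the deadlock itself: the "Joining processes that use queues" entry in the multiprocessing programming guidelines warns that a process which has put items on a Queue waits before terminating until all buffered items have been flushed to the underlying pipe. Joining such a process before its items have been read can therefore hang. That would be consistent with small inputs finishing (the results fit in the pipe buffer) and larger ones getting stuck, though this is an inference from the symptoms, not something confirmed in the thread. Below is a minimal sketch that reads the queue before joining. It uses only the standard library; split_into_chunks is a hypothetical helper, the points are plain tuples instead of NumPy arrays, and the centroid/area computation is only a stand-in for the elided "do stuff" step.

```python
import itertools
import multiprocessing as mp
import random


def split_into_chunks(items, n_chunks):
    """Hypothetical helper: split items into n_chunks contiguous slices of near-equal length."""
    size, extra = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def calc3PointsList(points, output):
    xy = []
    r = []
    for (ax, ay), (bx, by), (cx, cy) in points:
        # Stand-in for the elided computation: centroid and area of the triangle.
        xy.append(((ax + bx + cx) / 3.0, (ay + by + cy) / 3.0))
        r.append(abs((bx - ax) * (cy - ay) - (cx - ax) * (by - ay)) / 2.0)
    output.put((xy, r))


if __name__ == "__main__":
    random.seed(0)
    # 30 points, i.e. more than the 18 at which the original code stopped finishing.
    random_points = [(random.uniform(-0.5, 0.5), random.uniform(-0.5, 0.5))
                     for _ in range(30)]
    combList = list(itertools.combinations(random_points, 3))
    n_workers = 5  # assumed worker count, matching N - 1 in the code above

    output = mp.Queue()
    processes = [mp.Process(target=calc3PointsList, args=(chunk, output))
                 for chunk in split_into_chunks(combList, n_workers)]
    for p in processes:
        p.start()
    # Drain the queue BEFORE joining: each worker puts exactly one item,
    # so one blocking get() per worker collects everything.
    results = [output.get() for _ in processes]
    for p in processes:
        p.join()
    print(sum(len(r) for _, r in results), "triples processed")
```

Counting one get() per worker also avoids relying on qsize(), which the documentation describes as approximate and which is not implemented on every platform. Results arrive in completion order, not in chunk order, so if ordering matters each worker would need to send its chunk index along with its results.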