Parallel processing with nested loops in Python

Asked: 2019-11-19 12:07:44

Tags: python pandas parallel-processing multiprocessing

For performance reasons, I want to run my function in parallel in Python:

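import multiprocessing as mp
import networkx as nx

source_nodes = [10413173, 10414530, 10414530, 10437199]
sink_nodes = [10420346, 10438770, 10438711, 10414530, 10436258]


def createpath(source, sink):
    # Directed_G is the networkx directed graph, built elsewhere (not shown here).
    # path must be local: rebinding a module-level path here raises UnboundLocalError.
    path = []
    for i in source:
        for j in sink:
            path = path + list(nx.all_simple_paths(Directed_G, i, j))
    return path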

As I understand it, I have to pass an iterable to the apply function. But my idea was to do something like this:

results = [pool.apply(createpath, args=(source_nodes, sink_nodes))]

This applies the function without giving it an iterable. I managed to get it to run, but I don't think it is running in parallel.

Do you think I should put the apply call inside the first loop?
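For context on the suspicion above: the `multiprocessing` documentation states that `Pool.apply` blocks until the result is ready and executes the function in only one worker of the pool, so a single call over the whole lists does not spread the nested loop across processes. The following is a minimal sketch that makes this visible. `work_on_all` and the small integer lists are hypothetical stand-ins for `createpath` and the node lists.

```python
import os
from multiprocessing import Pool


def work_on_all(sources, sinks):
    # Hypothetical stand-in for createpath: the whole nested loop runs in
    # one call, and we also report which process executed it.
    pairs = [(i, j) for i in sources for j in sinks]
    return os.getpid(), pairs


if __name__ == "__main__":
    with Pool(2) as pool:
        # One apply call = one task = one worker process, and the caller
        # waits here until that task has finished.
        worker_pid, pairs = pool.apply(work_on_all, args=([1, 2], [3, 4]))
    print("parent pid:", os.getpid())
    print("pid that ran the whole loop:", worker_pid)
    print("pairs handled by that single worker:", len(pairs))
```

To use more than one worker, the work has to be split into several tasks, for example with `Pool.map` or `Pool.apply_async`.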

1 Answer:

Answer 0: (score: 2)

from multiprocessing import Pool


source_nodes = [1,2,3,4,5,6]
sink_nodes =  [1,1,1,1,1,1,1,1,1]


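def sum_values(parameter_tuple):
    # Each task receives both lists plus the [start, stop) slice of source to handle.
    source, sink, start, stop = parameter_tuple
    out = 0
    for i in range(start, stop):
        val_i = source[i]
        for j in sink:
            out += val_i * j
    return out

if __name__ == "__main__":
    # Serial reference: one call over the whole index range.
    params = (source_nodes, sink_nodes, 0, 6)
    print(sum_values(params))
    # Parallel version: split the outer loop into two halves, one task each.
    with Pool(2) as p:
        print(p.map(sum_values, [
            (source_nodes, sink_nodes, 0, 3),
            (source_nodes, sink_nodes, 3, 6),
        ]))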

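You can try running this program. It runs in parallel using the map pattern on a pool of 2 worker processes (`Pool` creates processes, not threads). Each task returns a partial sum for its slice of the outer loop, so the overall result is the sum of the values returned by `p.map`. Here the serial call prints 189 and the pool prints [54, 135], which add up to 189.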
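The same idea might be carried over to the path search in the question by making each (source, sink) pair its own task. The sketch below is only an illustration under stated assumptions: `GRAPH` is a toy adjacency dict rather than the asker's `Directed_G`, and `simple_paths` is a simplified, hypothetical stand-in for `nx.all_simple_paths` so that the example runs with the standard library alone.

```python
from itertools import product
from multiprocessing import Pool

# Toy directed graph (assumption for illustration only; the question uses
# a networkx graph named Directed_G).
GRAPH = {
    "a": ["b", "c"],
    "b": ["d"],
    "c": ["d"],
    "d": [],
}


def simple_paths(graph, start, goal, _prefix=None):
    # Hypothetical stand-in for nx.all_simple_paths: depth-first search
    # that never revisits a node already on the current path.
    prefix = (_prefix or []) + [start]
    if start == goal:
        return [prefix]
    found = []
    for nxt in graph.get(start, []):
        if nxt not in prefix:
            found.extend(simple_paths(graph, nxt, goal, prefix))
    return found


def paths_for_pair(pair):
    # One task = one (source, sink) pair, i.e. one iteration of the inner loop.
    source, sink = pair
    return simple_paths(GRAPH, source, sink)


def create_paths_parallel(sources, sinks, processes=2):
    # Flatten the nested loop into a list of pairs and map it over the pool.
    pairs = list(product(sources, sinks))
    with Pool(processes) as pool:
        per_pair = pool.map(paths_for_pair, pairs)
    # pool.map keeps the input order, so the result matches the serial loop.
    return [path for paths in per_pair for path in paths]


if __name__ == "__main__":
    print(create_paths_parallel(["a", "b"], ["d"]))
```

With a real networkx graph, the graph has to be reachable from every worker, for example as a module-level object as sketched here, or passed along with each task. Whether this is faster than the serial loop depends on how expensive each path search is compared with the cost of sending results back between processes.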