Question

Using multiprocessing.Pool, I can split an input list for a single function so that it is processed in parallel across multiple CPUs. Like this:

from multiprocessing import Pool

def f(x):
    return x*x

if __name__ == '__main__':
    pool = Pool(processes=4)
    results = pool.map(f, range(100))
    pool.close()
    pool.join()

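However, this does not allow running different functions on different processors. Suppose I want to do something like this, in parallel: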

foo1(args1) --> Processor1
foo2(args2) --> Processor2

How can this be done?

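Edit: Following Darkonaut's remarks, I do not care about specifically assigning foo1 to processor number 1. It can be any processor chosen by the OS. I am only interested in running independent functions in different, parallel processes. So rather: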

foo1(args1) --> process1
foo2(args2) --> process2

Answer 1

I usually find it easiest to use the concurrent.futures module for concurrency. You can achieve the same thing with multiprocessing, but concurrent.futures has (IMO) a nicer interface.

Your example would then be:

from concurrent.futures import ProcessPoolExecutor


def foo1(x):
    return x * x


def foo2(x):
    return x * x * x


if __name__ == '__main__':
    with ProcessPoolExecutor(2) as executor:
        # these return immediately and are executed in parallel, on separate processes
        future_1 = executor.submit(foo1, 1)
        future_2 = executor.submit(foo2, 2)
    # get results / re-raise exceptions that were thrown in workers
    result_1 = future_1.result()  # contains foo1(1)
    result_2 = future_2.result()  # contains foo2(2)
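
For comparison, here is a minimal sketch of the same idea using multiprocessing directly, as mentioned above. It relies on Pool.apply_async, which (like executor.submit) returns immediately with a handle; calling get() on that handle blocks until the result is ready and re-raises any exception thrown in the worker. foo1 and foo2 are the same toy functions as above, and run_both is a helper name introduced here purely for illustration:

```python
from multiprocessing import Pool


def foo1(x):
    return x * x


def foo2(x):
    return x * x * x


def run_both(a, b):
    # Hypothetical helper: run foo1(a) and foo2(b) in two worker processes.
    with Pool(processes=2) as pool:
        # apply_async returns an AsyncResult immediately
        async_1 = pool.apply_async(foo1, (a,))
        async_2 = pool.apply_async(foo2, (b,))
        # get() blocks until the result is ready and re-raises worker
        # exceptions; call it before leaving the with block, because
        # exiting the block terminates the pool
        return async_1.get(), async_2.get()


if __name__ == '__main__':
    print(run_both(1, 2))
```

One difference worth noting: leaving a `with Pool(...)` block terminates the pool, whereas leaving a `with ProcessPoolExecutor(...)` block waits for pending work, which is why the results are collected inside the block here.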

If you have many inputs, it is better to use executor.map with the chunksize argument:

from concurrent.futures import ProcessPoolExecutor


def foo1(x):
    return x * x


def foo2(x):
    return x * x * x


if __name__ == '__main__':
    with ProcessPoolExecutor(4) as executor:
        # these return immediately; the work is executed in parallel, on separate processes
        iter_1 = executor.map(foo1, range(10000), chunksize=100)
        iter_2 = executor.map(foo2, range(10000), chunksize=100)
    # executor.map returns an iterator (not a future), which we have to consume to get the results
    result_1 = list(iter_1)  # contains [foo1(x) for x in range(10000)]
    result_2 = list(iter_2)  # contains [foo2(x) for x in range(10000)]
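
To make the effect of chunksize more concrete: with chunksize=100, the 10000 inputs are handed to the workers in batches of roughly 100 items instead of one at a time, so each map call produces about 100 tasks rather than 10000. The sketch below only illustrates that batching idea. chunked is a hypothetical helper written for this example, not part of concurrent.futures, and it is not meant to describe the library's internals:

```python
from itertools import islice


def chunked(iterable, chunksize):
    # Hypothetical helper: yield successive tuples of up to `chunksize` items.
    it = iter(iterable)
    while True:
        chunk = tuple(islice(it, chunksize))
        if not chunk:
            return
        yield chunk


chunks = list(chunked(range(10000), 100))
print(len(chunks))     # number of batches
print(len(chunks[0]))  # items in the first batch
```

Fewer, larger batches mean less inter-process communication per item, at the cost of coarser load balancing between workers.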

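Note that the optimal value for chunksize, the optimal number of processes, and whether process-based concurrency improves performance at all depend on many factors:

- The runtime of foo1 / foo2. If they are extremely cheap (as in this example), the communication overhead between processes may dominate the total runtime.
- Spawning a process takes time, so the code inside with ProcessPoolExecutor needs to run long enough for that cost to amortize.
- The actual number of physical processors in the machine you are running on.
- Whether your application is IO bound or compute bound.
- Whether the functions you use in foo are already parallelized (such as some np.linalg solvers or scikit-learn estimators).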
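
Because of these factors, the most reliable approach is usually to measure. The following is a rough sketch of such a comparison, assuming the same toy foo1 as above. run_sequential, run_parallel and timed are helper names made up for this example, and the worker count and chunksize are arbitrary starting points rather than recommendations:

```python
import os
import time
from concurrent.futures import ProcessPoolExecutor


def foo1(x):
    return x * x


def run_sequential(inputs):
    return [foo1(x) for x in inputs]


def run_parallel(inputs, workers, chunksize):
    with ProcessPoolExecutor(workers) as executor:
        return list(executor.map(foo1, inputs, chunksize=chunksize))


def timed(fn, *args):
    # Return (result, elapsed seconds) for a single call.
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


if __name__ == '__main__':
    inputs = range(10000)
    # os.cpu_count() reports logical CPUs, which may exceed physical cores
    workers = min(4, os.cpu_count() or 1)
    seq, t_seq = timed(run_sequential, inputs)
    par, t_par = timed(run_parallel, inputs, workers, 100)
    assert seq == par
    print(f"sequential: {t_seq:.4f}s, {workers} processes: {t_par:.4f}s")
```

For a function as cheap as foo1, the sequential version may well come out faster, which is exactly the overhead effect described in the first point above.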

Running different Python functions on separate CPUs
