Python What is the difference between a Pool of worker processes and just running multiple Processes?

Time: 2015-10-30 22:08:45

Tags: python process multiprocessing pool worker

I am not sure when to use a pool of workers versus multiple processes.

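from multiprocessing import Process

if __name__ == '__main__':
    processes = []

    # start 4 processes, each running some_function
    for m in range(1, 5):
        p = Process(target=some_function)
        p.start()
        processes.append(p)

    # wait for all of them to finish
    for p in processes:
        p.join()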

vs

from multiprocessing import Pool

if __name__ == '__main__':
    # start 4 worker processes
    with Pool(processes=4) as pool:
        # map() calls another_function once per element of inputs
        pool_outputs = pool.map(another_function, inputs)

2 answers:

Answer 0 (score: 5)

As PyMOTW (Python Module of the Week) puts it:

The Pool class can be used to manage a fixed number of workers for simple cases where the work to be done can be broken up and distributed between workers independently.

The return values from the jobs are collected and returned as a list.

The pool arguments include the number of processes and a function to run when starting the task process (invoked once per child).
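To make the quoted description concrete, here is a minimal sketch of those two pool arguments. The names `init_worker`, `double` and `run_pool` are hypothetical and chosen for illustration only; `processes` and `initializer` are the actual keyword arguments of `multiprocessing.Pool`.

```python
import os
from multiprocessing import Pool


def init_worker():
    # Runs once in each worker process when the pool starts it.
    print(f"worker {os.getpid()} starting")


def double(x):
    # Hypothetical task: receives one input element at a time.
    return x * 2


def run_pool(inputs, workers=2):
    # processes: number of workers; initializer: called once per child.
    with Pool(processes=workers, initializer=init_worker) as pool:
        # map() collects the return values into a list, in input order.
        return pool.map(double, inputs)


if __name__ == "__main__":
    print(run_pool(range(5)))  # [0, 2, 4, 6, 8]
```

The "starting" lines are printed once per worker, not once per task, which is what "invoked once per child" refers to.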

Have a look at the examples given there to better understand its applications, functionality and parameters.

Basically the Pool is a helper, easing the management of the processes (workers) in those cases where all they need to do is consume common input data, process it in parallel and produce a joint output.

The Pool does quite a few things that you would otherwise have to code yourself (not too hard, but still, it's convenient to have a ready-made solution), for example:

  • the splitting of the input data
  • simplifying the target function: it can be designed to expect a single input element, and the Pool calls it once for each element of the subset allocated to that worker
  • waiting for the workers to finish their job (i.e. joining the processes)
  • ...
  • merging the output of each worker to produce the final output
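To show roughly what that bookkeeping looks like when done by hand, here is a sketch that reproduces a `pool.map`-style call using only `Process` and `Queue`. It is an illustration under assumptions, not how `Pool` is implemented internally: `square`, `worker` and `manual_map` are hypothetical names, and the strided split is just one simple way to divide the input.

```python
from multiprocessing import Process, Queue


def square(x):
    # Hypothetical per-element task.
    return x * x


def worker(index, chunk, results):
    # Process this worker's share and report it together with its index.
    results.put((index, [square(x) for x in chunk]))


def manual_map(inputs, workers=4):
    inputs = list(inputs)
    # Split the input: worker i gets elements i, i + workers, i + 2*workers, ...
    chunks = [inputs[i::workers] for i in range(workers)]
    results = Queue()
    procs = [Process(target=worker, args=(i, chunk, results))
             for i, chunk in enumerate(chunks)]
    for p in procs:
        p.start()
    # Drain the queue before joining so no child blocks on a full queue.
    collected = dict(results.get() for _ in procs)
    # Wait for the workers to finish.
    for p in procs:
        p.join()
    # Merge the per-worker outputs back into input order.
    merged = [None] * len(inputs)
    for i in range(workers):
        merged[i::workers] = collected[i]
    return merged


if __name__ == "__main__":
    print(manual_map(range(10)))
```

With a Pool, everything in `manual_map` collapses to a single `pool.map(square, inputs)` call.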

Answer 1 (score: 0)

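The following information may help you understand the difference between Pool and Process in Python's multiprocessing module:

Pool:

  1. When you have a large amount of data, you can use the Pool class.
  2. Only the processes currently executing are kept in memory.
  3. I/O operations: it waits until the I/O operation completes and does not schedule another process in the meantime. This may increase execution time.
  4. Uses a FIFO scheduler.

Process:

  1. When you have little data or few functions, and few repetitive tasks to do.
  2. It puts all the processes in memory. For larger tasks, this may cause you to run out of memory.
  3. I/O operations: the Process class suspends the process performing the I/O operation and schedules another process in parallel.
  4. Uses a FIFO scheduler.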
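
One way to observe the memory point in isolation: a Pool keeps a fixed number of worker processes and hands tasks to them, whereas starting one Process per task creates as many processes as there are tasks. The sketch below only demonstrates the Pool side; `report_pid` and `distinct_workers` are hypothetical names used for illustration.

```python
import os
from multiprocessing import Pool


def report_pid(_):
    # Each task returns the PID of the worker process that ran it.
    return os.getpid()


def distinct_workers(tasks=20, workers=2):
    # Run many tasks on a small pool and count how many processes did the work.
    with Pool(processes=workers) as pool:
        pids = pool.map(report_pid, range(tasks))
    return len(set(pids))


if __name__ == "__main__":
    # 20 tasks, but never more than 2 distinct worker processes.
    print(distinct_workers())
```

The count never exceeds the `processes` value passed to the Pool, regardless of how many tasks are submitted.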