Order in which child processes run in the multiprocessing module

Time: 2018-01-11 13:36:35

Tags: python multiprocessing

I start multiple processes in a for loop.

import os
from multiprocessing import Process

def run_proc(name):
    print('child process %s (%s) running ...' %(name,os.getpid()))

if __name__ == '__main__':
    print('parent process %s.' %os.getppid())
    for i in range(5):
        p = Process(target=run_proc,args=(str(i),))
        print('process will start'+str(i))
        p.start()
    p.join()
    print('process is end')

This is the result I got.

parent process 6497.  
process will start  
process will start  
child process 0 (6984) running ...  
process will start  
process will start  
process will start  
child process 2 (6986) running ...  
child process 1 (6985) running ...  
child process 3 (6987) running ...  
child process 4 (6988) running ...  
process is end  

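Why does a child process that was created later (child 2) run before one that was created earlier (child 1)? Why don't I get the following result?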

parent process 6497.  
process will start  
process will start  
child process 0 (6984) running ...  
process will start  
process will start  
process will start  
child process 1 (6986) running ...  
child process 2 (6985) running ...  
child process 3 (6987) running ...  
child process 4 (6988) running ...  
process is end

What Jean-François Fabre described is how to produce the following result:

parent process 6497.
process will start
child process 0 (9639) running ...
process will start
child process 1 (9640) running ...
process will start
child process 2 (9641) running ...
process will start
child process 3 (9643) running ...
process will start
child process 4 (9644) running ...
process is end

I can get that simply by moving p.join() into the for loop, as follows:

import os
from multiprocessing import Process

def run_proc(name):
    print('child process %s (%s) running ...' %(name,os.getpid()))

if __name__ == '__main__':
    print('parent process %s.' %os.getppid())
    for i in range(5):
        p = Process(target=run_proc,args=(str(i),))
        print('process will start'+str(i))
        p.start()
        p.join()
    print('process is end')

What I want to know is why my original code produces the following output:

parent process 6497.  
process will start  
process will start  
child process 0 (6984) running ...  
process will start  
process will start  
process will start  
child process 2 (6986) running ...  
child process 1 (6985) running ...  
child process 3 (6987) running ...  
child process 4 (6988) running ...  
process is end  

instead of:

parent process 6497.  
process will start  
process will start  
child process 0 (6984) running ...  
process will start  
process will start  
process will start  
child process 1 (6986) running ...  
child process 2 (6985) running ...  
child process 3 (6987) running ...  
child process 4 (6988) running ...  
process is end

This is a different question.

1 answer:

Answer 0: (score: 8)

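There is a race condition between the main process, which creates and starts the child processes, and the child processes themselves.

As soon as the main process calls p.start(), the child process can run (and print). But the main process is also busy creating the next child process. Which one prints its next line first? It is hard to know. If the main process manages to start the next child before the previous one has printed, the race is now between the two children: that is what you are experiencing.

Processes can run in parallel, but they still have synchronization points when they call into the operating system. Whoever reaches the operating system first gets served first (for example, when printing to the console).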
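The race described above can be observed with a small sketch. It is only an illustration under assumptions: the names report, is_creation_order and observe_start_order are made up for this example, and a multiprocessing.Queue stands in for the console as the shared resource the children compete for.

```python
import os
from multiprocessing import Process, Queue


def report(index, queue):
    # Each child pushes its index and PID as soon as it gets to run.
    queue.put((index, os.getpid()))


def is_creation_order(indices):
    # True when the children reported in the same order they were started.
    return list(indices) == sorted(indices)


def observe_start_order(count=5):
    queue = Queue()
    processes = [Process(target=report, args=(i, queue)) for i in range(count)]
    for p in processes:
        p.start()
    # Read one message per child before joining, so no child blocks on the pipe.
    arrivals = [queue.get() for _ in processes]
    for p in processes:
        p.join()
    return [index for index, _pid in arrivals]


if __name__ == '__main__':
    # The order may differ from run to run; that is the race condition.
    for attempt in range(3):
        order = observe_start_order()
        label = 'creation order' if is_creation_order(order) else 'reordered'
        print(order, label)
```

On a lightly loaded machine the children often report in creation order, so several runs may be needed before a reordering shows up.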

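Of course, putting p.join() inside the loop restores the order, but it also cancels the benefit of multiprocessing, because the main process waits until each child has finished before creating the next one.

The order normally doesn't matter, since you are running independent tasks in parallel.

I would first create the processes in a list comprehension, then start them in a loop with a small delay, so that each process has time to start and print before the next one is started. Note that a delay makes ordered output very likely, not guaranteed.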

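import time

# Create all the processes first, then start them one by one.
process_list = [Process(target=run_proc,args=(str(i),)) for i in range(5)]
for i,p in enumerate(process_list):
    print('process {} will start'.format(i))
    p.start()
    time.sleep(0.1)  # give the child time to start and print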

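While the main process waits briefly, the child process gets some breathing room to print.

Also note that your final p.join() only joins the last process of the loop. It should be (now using our brand-new process_list):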

for p in process_list:
    p.join()
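Putting the two snippets above together gives a complete script along these lines. This is a sketch, not the only way to arrange it: the helper name start_staggered and its delay parameter are assumptions made for this example, and run_proc is the function from the question.

```python
import os
import time
from multiprocessing import Process


def run_proc(name):
    print('child process %s (%s) running ...' % (name, os.getpid()))


def start_staggered(count, delay=0.1):
    # Create every process up front, then start them with a short pause.
    process_list = [Process(target=run_proc, args=(str(i),)) for i in range(count)]
    for i, p in enumerate(process_list):
        print('process {} will start'.format(i))
        p.start()
        time.sleep(delay)  # makes ordered output likely, not guaranteed
    # Join every process, not only the last one created.
    for p in process_list:
        p.join()
    return [p.exitcode for p in process_list]


if __name__ == '__main__':
    print('parent process %s.' % os.getpid())
    exit_codes = start_staggered(5)
    print('process is end', exit_codes)
```

Returning the exit codes is an extra added here for illustration; it shows that every child was really waited for.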

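In most cases, starting or finishing in order doesn't matter. You can compute all the ordering information in advance in the main process (for example, by giving the processes names with increasing numbers).

Note that the classic problem is usually not to make processes start in order, but to be able to match the results they produce with the inputs you gave them (pass an index to each process and build a list of outputs, so that in the end you know which input produced which output).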
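A minimal sketch of that idea follows, assuming each child squares its input and sends the result back through a multiprocessing.Queue. The names work, order_by_index and run_tagged are hypothetical and chosen only for this example.

```python
from multiprocessing import Process, Queue


def work(index, value, results):
    # Tag the output with the index of the input that produced it.
    results.put((index, value * value))


def order_by_index(tagged, count):
    # Rebuild a list in which position i holds the output for input i.
    ordered = [None] * count
    for index, output in tagged:
        ordered[index] = output
    return ordered


def run_tagged(inputs):
    results = Queue()
    processes = [Process(target=work, args=(i, v, results))
                 for i, v in enumerate(inputs)]
    for p in processes:
        p.start()
    # Results arrive in whatever order the children finish.
    tagged = [results.get() for _ in processes]
    for p in processes:
        p.join()
    return order_by_index(tagged, len(inputs))


if __name__ == '__main__':
    print(run_tagged([3, 1, 4, 1, 5]))  # [9, 1, 16, 1, 25]
```

Because every result carries its index, the order in which the children happen to run no longer affects the final list.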

In that case, look up the map function of multiprocessing.pool (see: Python multiprocessing.pool sequential run of processes)
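As a rough illustration of that suggestion, Pool.map returns its results in the order of the inputs, regardless of which worker finishes first. The function names square and squares_in_parallel and the worker count are assumptions made for this sketch.

```python
from multiprocessing import Pool


def square(x):
    return x * x


def squares_in_parallel(values, workers=2):
    with Pool(processes=workers) as pool:
        # map keeps the results aligned with 'values', whatever the run order.
        return pool.map(square, values)


if __name__ == '__main__':
    print(squares_in_parallel([3, 1, 4, 1, 5]))  # [9, 1, 16, 1, 25]
```

The pool handles the matching of inputs to outputs, so no explicit index needs to be passed around.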