Pause/restart functionality for multiprocessing using `event`

Time: 2014-06-07 18:35:57

Tags: python multiprocessing

I am using the code posted below to enable pause/restart functionality for a multiprocessing Pool.

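I would appreciate it if you could explain why the event variable has to be passed as an argument to the setup() function, and why a global variable unpaused is then declared inside the scope of setup() and set to the same object as event: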

def setup(event):
    global unpaused
    unpaused = event

I would also like to know the logic behind the following statement:

pool=mp.Pool(2, setup, (event,))

The first argument is the number of worker processes for the Pool to use. The second argument is the setup() function mentioned above.

Why can't it be done like this instead:

global event
event=mp.Event()
pool = mp.Pool(processes=2)

Then, whenever we need to pause or restart a job, we would just use:

To pause:

event.clear()

To restart:

event.set()

Why do we need the global variable unpaused? I don't get it! Please advise.


import time
import multiprocessing as mp

def myFunct(arg):
    proc=mp.current_process()
    print('starting: %s %s ...\n' % (proc.name, proc.pid))
    for i in range(110):
        for n in range(500000):
            pass
    print('\t ... %s %s completed\n' % (proc.name, proc.pid))

def setup(event):
    global unpaused
    unpaused = event

def pauseJob():
    event.clear()

def continueJob():
    event.set()


event=mp.Event()

pool=mp.Pool(2, setup, (event,))
pool.map_async(myFunct, [1,2,3])

event.set()

pool.close()
pool.join()

1 Answer:

Answer 0: (score: 14)

You are misunderstanding how Event works. But first, I'll cover what setup is doing.

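The setup function is executed in each child process of the pool as soon as that process starts. So you are setting a global variable (named unpaused in your code, event in the example below) inside each process to be the same multiprocessing.Event object that you created in the main process. You end up with every child process having a global variable that refers to the same multiprocessing.Event object. This lets you signal the child processes from the main process, just as you want. See this example: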

import multiprocessing

event = None
def my_setup(event_):
    global event
    event = event_
    print("event is %s in child" % event)


if __name__ == "__main__":
    event = multiprocessing.Event()
    p = multiprocessing.Pool(2, my_setup, (event,))
    print("event is %s in parent" % event)
    p.close()
    p.join()

Output:

dan@dantop2:~$ ./mult.py 
event is <multiprocessing.synchronize.Event object at 0x7f93cd7a48d0> in child
event is <multiprocessing.synchronize.Event object at 0x7f93cd7a48d0> in child
event is <multiprocessing.synchronize.Event object at 0x7f93cd7a48d0> in parent

As you can see, it is the same event in both child processes and in the parent. Just like you want.

However, passing event to setup is not actually necessary. You can just inherit the event instance from the parent process:

import multiprocessing

event = None

def my_worker(num):
    print("event is %s in child" % event)

if __name__ == "__main__":
    event = multiprocessing.Event()
    pool = multiprocessing.Pool(2)
    pool.map_async(my_worker, [i for i in range(pool._processes)]) # Just call my_worker for every process in the pool.

    pool.close()
    pool.join()
    print("event is %s in parent" % event)

Output:

dan@dantop2:~$ ./mult.py 
event is <multiprocessing.synchronize.Event object at 0x7fea3b1dc8d0> in child
event is <multiprocessing.synchronize.Event object at 0x7fea3b1dc8d0> in child
event is <multiprocessing.synchronize.Event object at 0x7fea3b1dc8d0> in parent

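This is simpler, and it is the preferred way to share a semaphore between parent and children (note that it relies on the children being forked from the parent, as on Linux; with the spawn start method the children would not see an event created under the __main__ guard). In fact, if you try to pass event directly to a worker function as an argument, you get an error: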

RuntimeError: Semaphore objects should only be shared between processes through inheritance
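
The error comes from pickling: the arguments of a task are pickled before they are sent to a worker, and a multiprocessing.Event refuses to be pickled outside of process creation. Here is a minimal sketch that reproduces just that step with pickle directly (Python 3 syntax; can_pickle is a hypothetical helper, and the exact wording of the message may differ between Python versions):

```python
import multiprocessing as mp
import pickle


def can_pickle(obj):
    # Try the same serialization step that a Pool applies to task arguments.
    try:
        pickle.dumps(obj)
        return True
    except RuntimeError as exc:
        print("pickling failed: %s" % exc)
        return False


if __name__ == "__main__":
    print(can_pickle(42))          # an ordinary argument is fine
    print(can_pickle(mp.Event()))  # an Event is rejected with RuntimeError
```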

Now, back to how you are misunderstanding the way Event works. Event is meant to be used like this:

import time
import multiprocessing

def event_func(num):
    print('\t%r is waiting' % multiprocessing.current_process())
    event.wait()
    print('\t%r has woken up' % multiprocessing.current_process())

if __name__ == "__main__":
    event = multiprocessing.Event()

    pool = multiprocessing.Pool()
    a = pool.map_async(event_func, [i for i in range(pool._processes)])

    print('main is sleeping')
    time.sleep(2)

    print('main is setting event')
    event.set()

    pool.close()
    pool.join()

Output:

main is sleeping
    <Process(PoolWorker-1, started daemon)> is waiting
    <Process(PoolWorker-2, started daemon)> is waiting
    <Process(PoolWorker-4, started daemon)> is waiting
    <Process(PoolWorker-3, started daemon)> is waiting
main is setting event
    <Process(PoolWorker-2, started daemon)> has woken up
    <Process(PoolWorker-1, started daemon)> has woken up
    <Process(PoolWorker-4, started daemon)> has woken up
    <Process(PoolWorker-3, started daemon)> has woken up

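As you can see, the child processes need to explicitly call event.wait() in order to be paused. They are unpaused when event.set() is called in the main process. Right now, none of your workers call event.wait(), so none of them will ever be paused. I suggest you take a look at the docs for threading.Event, which multiprocessing.Event replicates.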