Pool.map - why don't the worker processes crash earlier?

Asked: 2014-10-09 14:46:05

Tags: python python-2.7 multiprocessing

Say I do this:

import multiprocessing as mp

def f(x):
    raise OverflowError  # raised BEFORE the print
    print x

if __name__ == '__main__':
    pool = mp.Pool(processes=1)
    for _ in pool.imap_unordered(f, range(10)):
        pass
    pool.close()
    pool.join()

Output:

Traceback (most recent call last):
  File "test0.py", line 9, in <module>
    for _ in pool.imap_unordered(f, range(10)):
  File "/Users/usualme/anaconda/lib/python2.7/multiprocessing/pool.py", line 659, in next
    raise value
OverflowError

OK, that output makes sense. The exception is raised before the print statement, so there is no output. Now almost the same code, but with the two lines swapped:

import multiprocessing as mp

def f(x):
    print x
    raise OverflowError  # raised AFTER the print

if __name__ == '__main__':
    pool = mp.Pool(processes=1)
    for _ in pool.imap_unordered(f, range(10)):
        pass
    pool.close()
    pool.join()

Output:

0
1
2
3
4
5
6
7
8
9
Traceback (most recent call last):
  File "test0.py", line 9, in <module>
    for _ in pool.imap_unordered(f, range(10)):
  File "/Users/usualme/anaconda/lib/python2.7/multiprocessing/pool.py", line 659, in next
    raise value
OverflowError

I don't understand this output. I was expecting the number 0 followed by a stack trace, or all 10 numbers followed by 10 stack traces. Why are all the numbers printed but only one stack trace shown? Why does the worker process only seem to crash at the very end?

1 answer:

Answer 0 (score: 3)

It's just a matter of timing - the worker process doesn't care that an exception was raised in the function it runs; it simply sends the exception back to the parent and moves on to the next task. This is the loop it runs (slightly simplified):

while maxtasks is None or (maxtasks and completed < maxtasks):
    try:
        task = get()  # Get task from parent
    except (EOFError, OSError):
        util.debug('worker got EOFError or OSError -- exiting')
        break

    if task is None:
        util.debug('worker got sentinel -- exiting')
        break

    job, i, func, args, kwds = task
    try:
        result = (True, func(*args, **kwds))  # Call the function passed from the parent
    except Exception as e:  # We end up in here if the worker raises an exception
        if wrap_exception:
            e = ExceptionWithTraceback(e, e.__traceback__)
        result = (False, e)  # The exception object is stored as the result

    put((job, i, result))  # Send result to parent process
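The key point in the loop above is that a raised exception is simply converted into a `(False, exc)` result and the loop carries on. That behavior can be sketched in a single process, with a plain list standing in for the task queue (a minimal sketch; the tuple layout mirrors the simplified loop above):

```python
def f(x):
    raise OverflowError

# Plain list standing in for the task queue: (job, i, func, args, kwds)
tasks = [(0, i, f, (i,), {}) for i in range(3)]

results = []
for job, i, func, args, kwds in tasks:
    try:
        result = (True, func(*args, **kwds))
    except Exception as e:
        result = (False, e)  # the exception object is stored as the result
    results.append((job, i, result))  # stands in for put()

# Every task produced a result; the loop never stopped.
print(len(results))  # → 3
```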

So even though the very first task raises an exception, it takes a little while for that result to travel between the two processes, and for the parent to actually pull it off the Queue and re-raise the exception. In that window of time, the worker is able to execute all the remaining tasks. If you make the worker function slower, you'll see it execute fewer of them:

import multiprocessing as mp
import time

def f(x):
    print x
    time.sleep(2)
    raise OverflowError 

if __name__ == '__main__':
    pool = mp.Pool(processes=1)
    for _ in pool.imap_unordered(f, range(10)):
        pass
    pool.close()
    pool.join()

Output:

0
1
Traceback (most recent call last):
  File "p.py", line 11, in <module>
    for _ in pool.imap_unordered(f, range(10)):
  File "/usr/lib/python2.7/multiprocessing/pool.py", line 626, in next
    raise value
OverflowError

If you passed a much larger iterable, you would also see only some fraction of the numbers printed, because the worker wouldn't have time to finish all of them before the parent exits.

You only see one exception actually raised because, from the parent's point of view, the whole imap call should abort as soon as a single task fails. The parent pulls all of its children's results sequentially off a single Queue, so as soon as it sees the first exception the imap call ends, and the results of the remaining tasks are thrown away.