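Keyboard interrupts with Python's multiprocessing Pool

Time: 2009-09-10 23:59:36

Tags: python multiprocessing pool keyboardinterrupt

How can I handle KeyboardInterrupt events with Python's multiprocessing Pool? Here is a simple example: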

from multiprocessing import Pool
from time import sleep
import sys

def slowly_square(i):
    sleep(1)
    return i*i

def go():
    pool = Pool(8)
    try:
        results = pool.map(slowly_square, range(40))
    except KeyboardInterrupt:
        # **** THIS PART NEVER EXECUTES. ****
        pool.terminate()
        print "You cancelled the program!"
        sys.exit(1)
    print "\nFinally, here are the results: ", results

if __name__ == "__main__":
    go()

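When I run the code above, KeyboardInterrupt is raised when I press ^C, but the process simply hangs at that point and I have to kill it externally.

I want to be able to press ^C at any time and have all of the processes exit gracefully.

11 answers:

Answer 0 (score: 129)

This is a Python bug. While waiting for a condition in threading.Condition.wait(), KeyboardInterrupt is never delivered. Repro: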

import threading
cond = threading.Condition(threading.Lock())
cond.acquire()
cond.wait(None)
print "done"

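The KeyboardInterrupt exception is not delivered until wait() returns, and it never returns, so the interrupt never happens. KeyboardInterrupt should almost certainly interrupt a condition wait.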

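Note that this does not happen if a timeout is specified; cond.wait(1) receives the interrupt immediately. So a workaround is to specify a timeout. To do that, replace

    results = pool.map(slowly_square, range(40))

with

    results = pool.map_async(slowly_square, range(40)).get(9999999)

or similar.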
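
The same workaround can be written as a minimal Python 3 sketch. The helper name map_with_timeout, the pool size, and the 60-second default are illustrative assumptions and not part of the answer. Note that get() raises multiprocessing.TimeoutError when the timeout expires, so the value should be comfortably larger than the expected run time.

```python
from multiprocessing import Pool


def slowly_square(i):
    return i * i


def map_with_timeout(func, items, processes=4, timeout=60):
    # Hypothetical helper (an assumption, not from the answer):
    # behaves like pool.map(), but waits with a finite timeout.
    with Pool(processes) as pool:
        # Leaving the with-block calls pool.terminate(), also when
        # KeyboardInterrupt or TimeoutError propagates out of get().
        return pool.map_async(func, items).get(timeout)


if __name__ == "__main__":
    print(map_with_timeout(slowly_square, range(10)))
```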

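Answer 1 (score: 46)

From what I have recently found, the best solution is to set up the worker processes to ignore SIGINT altogether and to confine all the cleanup code to the parent process. This fixes the problem for both idle and busy worker processes, and it requires no error-handling code in the child processes.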

import signal

...

def init_worker():
    signal.signal(signal.SIGINT, signal.SIG_IGN)

...

def main():
    pool = multiprocessing.Pool(size, init_worker)

    ...

    except KeyboardInterrupt:
        pool.terminate()
        pool.join()

An explanation and complete example code can be found at http://noswap.com/blog/python-multiprocessing-keyboardinterrupt/ and http://github.com/jreese/multiprocessing-keyboardinterrupt respectively.
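
Since the snippet above elides most of the program, here is one way the pieces might fit together in Python 3. It is a sketch under assumptions: run_pool, sigint_ignored, the pool size, and the timeout are hypothetical names and values, and the finite get() timeout is borrowed from Answer 0 rather than from this answer.

```python
import signal
from multiprocessing import Pool


def init_worker():
    # Runs once in every worker process: ignore Ctrl-C there.
    signal.signal(signal.SIGINT, signal.SIG_IGN)


def sigint_ignored(_):
    # Demo task: report whether this worker really ignores SIGINT.
    return signal.getsignal(signal.SIGINT) == signal.SIG_IGN


def run_pool(func, items, processes=2, timeout=60):
    # Hypothetical wrapper: all cleanup lives in the parent process.
    pool = Pool(processes, init_worker)
    try:
        results = pool.map_async(func, items).get(timeout)
    except BaseException:
        # Covers KeyboardInterrupt as well as timeouts and worker errors.
        pool.terminate()
        pool.join()
        raise
    pool.close()
    pool.join()
    return results


if __name__ == "__main__":
    print(run_pool(sigint_ignored, range(4)))
```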

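Answer 2 (score: 25)

For some reason, only exceptions that inherit from the base Exception class are handled normally. As a workaround, you can re-raise your KeyboardInterrupt as an Exception instance: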

from multiprocessing import Pool
import time

class KeyboardInterruptError(Exception): pass

def f(x):
    try:
        time.sleep(x)
        return x
    except KeyboardInterrupt:
        raise KeyboardInterruptError()

def main():
    p = Pool(processes=4)
    try:
        print 'starting the pool map'
        print p.map(f, range(10))
        p.close()
        print 'pool map complete'
    except KeyboardInterrupt:
        print 'got ^C while pool mapping, terminating the pool'
        p.terminate()
        print 'pool is terminated'
    except Exception, e:
        print 'got exception: %r, terminating the pool' % (e,)
        p.terminate()
        print 'pool is terminated'
    finally:
        print 'joining pool processes'
        p.join()
        print 'join complete'
    print 'the end'

if __name__ == '__main__':
    main()

Normally you would get the following output:

starting the pool map
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
pool map complete
joining pool processes
join complete
the end

So if you hit ^C, you will get:

starting the pool map
got ^C while pool mapping, terminating the pool
pool is terminated
joining pool processes
join complete
the end
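
The re-raising step can also be factored into a reusable wrapper. The following Python 3 sketch assumes a hypothetical decorator named interrupt_as_exception; it is not part of the answer's code and only illustrates the conversion from KeyboardInterrupt to an ordinary Exception subclass.

```python
import functools
import time


class KeyboardInterruptError(Exception):
    """Ordinary Exception used to carry a Ctrl-C out of a worker."""


def interrupt_as_exception(func):
    """Hypothetical decorator: turn KeyboardInterrupt into KeyboardInterruptError."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except KeyboardInterrupt:
            # KeyboardInterrupt derives from BaseException, not Exception.
            raise KeyboardInterruptError() from None
    return wrapper


@interrupt_as_exception
def f(x):
    time.sleep(x)
    return x
```

Because functools.wraps keeps the original name, a decorated module-level function such as f should still be picklable by reference and usable with Pool.map.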

Answer 3 (score: 7)

Usually this simple structure works for Ctrl-C on a Pool:

import signal

def signal_handle(_signal, frame):
    print "Stopping the Jobs."

signal.signal(signal.SIGINT, signal_handle)

As stated in a few similar posts:

Capture keyboardinterrupt in Python without try-except
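
A handler that only prints does not stop the pool by itself. One possible variation, sketched in Python 3 under assumptions, records the request so that the main loop can call pool.terminate(). The names state, handle_sigint, and install_handler are hypothetical, and signal.raise_signal requires Python 3.8 or later. Once such a handler is installed, Ctrl-C no longer raises KeyboardInterrupt in that process.

```python
import signal

# Shared flag that the main loop can poll between units of work.
state = {"stop_requested": False}


def handle_sigint(signum, frame):
    # Keep the handler tiny: just record that Ctrl-C was pressed.
    state["stop_requested"] = True


def install_handler():
    """Install handle_sigint and return the previous SIGINT handler."""
    return signal.signal(signal.SIGINT, handle_sigint)


if __name__ == "__main__":
    previous = install_handler()
    signal.raise_signal(signal.SIGINT)  # simulate Ctrl-C (Python 3.8+)
    for _ in range(1000):
        pass  # give the interpreter a chance to run the handler
    print("stop requested:", state["stop_requested"])
    signal.signal(signal.SIGINT, previous)
```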

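Answer 4 (score: 5)

There seem to be two issues that make exceptions annoying while multiprocessing. The first (noted by Glenn) is that you need to use map_async with a timeout instead of map in order to get an immediate response (i.e., without finishing the processing of the entire list). The second (noted by Andrey) is that multiprocessing does not catch exceptions that do not inherit from Exception (e.g., SystemExit). So here is my solution, which deals with both of these: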

import sys
import functools
import traceback
import multiprocessing

def _poolFunctionWrapper(function, arg):
    """Run function under the pool

    Wrapper around function to catch exceptions that don't inherit from
    Exception (which aren't caught by multiprocessing, so that you end
    up hitting the timeout).
    """
    try:
        return function(arg)
    except:
        cls, exc, tb = sys.exc_info()
        if issubclass(cls, Exception):
            raise # No worries
        # Need to wrap the exception with something multiprocessing will recognise
        print "Unhandled exception %s (%s):\n%s" % (cls.__name__, exc, traceback.format_exc())
        raise Exception("Unhandled exception: %s (%s)" % (cls.__name__, exc))

def _runPool(pool, timeout, function, iterable):
    """Run the pool

    Wrapper around pool.map_async, to handle timeout.  This is required so as to
    trigger an immediate interrupt on the KeyboardInterrupt (Ctrl-C); see
    http://stackoverflow.com/questions/1408356/keyboard-interrupts-with-pythons-multiprocessing-pool

    Further wraps the function in _poolFunctionWrapper to catch exceptions
    that don't inherit from Exception.
    """
    return pool.map_async(functools.partial(_poolFunctionWrapper, function), iterable).get(timeout)

def myMap(function, iterable, numProcesses=1, timeout=9999):
    """Run the function on the iterable, optionally with multiprocessing"""
    if numProcesses > 1:
        pool = multiprocessing.Pool(processes=numProcesses, maxtasksperchild=1)
        mapFunc = functools.partial(_runPool, pool, timeout)
    else:
        pool = None
        mapFunc = map
    results = mapFunc(function, iterable)
    if pool is not None:
        pool.close()
        pool.join()
    return results

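Answer 5 (score: 4)

The voted answer does not tackle the core issue but a similar side effect.

Jesse Noller, the author of the multiprocessing library, explains how to correctly handle CTRL+C when using multiprocessing.Pool in an old blog post.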

import signal
from multiprocessing import Pool


def initializer():
    """Ignore CTRL+C in the worker process."""
    signal.signal(signal.SIGINT, signal.SIG_IGN)


pool = Pool(initializer=initializer)

try:
    pool.map(perform_download, downloads)
except KeyboardInterrupt:
    pool.terminate()
    pool.join()

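Answer 6 (score: 3)

I found that, for the time being, the best solution is not to use the multiprocessing.pool feature but rather to roll your own pool functionality. I provided an example demonstrating the error with apply_async, as well as an example showing how to avoid using the pool functionality altogether.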

http://www.bryceboe.com/2010/08/26/python-multiprocessing-and-keyboardinterrupt/
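
For readers who cannot reach the link, the idea of a hand-rolled pool can be sketched in Python 3 with Process and Queue. This is not the code from the linked post; simple_map, worker_loop, cube, and the default values are hypothetical, and the sketch assumes that workers ignore SIGINT while the parent terminates them on Ctrl-C or on a timeout.

```python
import multiprocessing
import signal


def worker_loop(func, tasks, results):
    # Workers ignore Ctrl-C; the parent decides when to stop them.
    signal.signal(signal.SIGINT, signal.SIG_IGN)
    # Consume (index, item) pairs until the None sentinel arrives.
    for index, item in iter(tasks.get, None):
        results.put((index, func(item)))


def simple_map(func, items, processes=2, timeout=60):
    """Hypothetical stand-in for Pool.map built from Process and Queue."""
    items = list(items)
    tasks = multiprocessing.Queue()
    results = multiprocessing.Queue()
    workers = [
        multiprocessing.Process(target=worker_loop, args=(func, tasks, results))
        for _ in range(processes)
    ]
    for worker in workers:
        worker.start()
    try:
        for pair in enumerate(items):
            tasks.put(pair)
        for _ in workers:
            tasks.put(None)  # one sentinel per worker
        output = [None] * len(items)
        for _ in items:
            # Raises queue.Empty if no result arrives within `timeout` seconds.
            index, value = results.get(timeout=timeout)
            output[index] = value
    except BaseException:
        # Ctrl-C or a timeout: stop the workers before re-raising.
        for worker in workers:
            worker.terminate()
        raise
    finally:
        for worker in workers:
            worker.join()
    return output


def cube(x):
    return x ** 3


if __name__ == "__main__":
    print(simple_map(cube, range(5)))
```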

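Answer 7 (score: 1)

I am new to Python. I looked everywhere for an answer and stumbled upon this page and a few other blogs and YouTube videos. I tried to copy and paste the author's code above and reproduce it with my Python 2.7.13 on Windows 7 64-bit. It is close to what I want to achieve.

I made my child processes ignore Ctrl-C and made the parent process terminate. It looks like bypassing the child processes does avoid the problem for me.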

#!/usr/bin/python

from multiprocessing import Pool
from time import sleep
from sys import exit


def slowly_square(i):
    try:
        print "<slowly_square> Sleeping and later running a square calculation..."
        sleep(1)
        return i * i
    except KeyboardInterrupt:
        print "<child processor> Don't care if you say CtrlC"
        pass


def go():
    pool = Pool(8)

    try:
        results = pool.map(slowly_square, range(40))
    except KeyboardInterrupt:
        pool.terminate()
        pool.close()
        print "You cancelled the program!"
        exit(1)
    print "Finally, here are the results", results


if __name__ == '__main__':
    go()

The part starting at pool.terminate() never seems to execute.

Answer 8 (score: 0)

You can try using the apply_async method of a Pool object, like this:

import multiprocessing
import time
from datetime import datetime


def test_func(x):
    time.sleep(2)
    return x**2


def apply_multiprocessing(input_list, input_function):
    pool_size = 5
    pool = multiprocessing.Pool(processes=pool_size, maxtasksperchild=10)

    try:
        jobs = {}
        for value in input_list:
            jobs[value] = pool.apply_async(input_function, [value])

        results = {}
        for value, result in jobs.items():
            try:
                results[value] = result.get()
            except KeyboardInterrupt:
                print "Interrupted by user"
                pool.terminate()
                break
            except Exception as e:
                results[value] = e
        return results
    except Exception:
        raise
    finally:
        pool.close()
        pool.join()


if __name__ == "__main__":
    iterations = range(100)
    t0 = datetime.now()
    results1 = apply_multiprocessing(iterations, test_func)
    t1 = datetime.now()
    print results1
    print "Multi: {}".format(t1 - t0)

    t2 = datetime.now()
    results2 = {i: test_func(i) for i in iterations}
    t3 = datetime.now()
    print results2
    print "Non-multi: {}".format(t3 - t2)

Output:

100
Multiprocessing run time: 0:00:41.131000
100
Non-multiprocessing run time: 0:03:20.688000

An advantage of this method is that results processed before the interruption are returned in the results dictionary:

>>> apply_multiprocessing(range(100), test_func)
Interrupted by user
{0: 0, 1: 1, 2: 4, 3: 9, 4: 16, 5: 25}

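Answer 9 (score: 0)

Many of these answers are old and/or do not seem to work with later versions of Python (I am running 3.8.5) on Windows if you are executing a method such as Pool.map, which blocks until all the submitted tasks have completed. The following is my solution.

  1. Call signal.signal(signal.SIGINT, signal.SIG_IGN) in each process in the process pool to ignore the interrupt entirely and leave the handling to the main process.

  2. Use method Pool.imap (or Pool.imap_unordered) instead of Pool.map. Pool.imap lazily evaluates your iterable argument for submitting tasks and processing results. In this way it (a) does not block waiting for all the results and, as a side benefit, (b) you can save memory, since your iterable can now be a generator function or expression.

  3. The key is to have the main process issue print statements periodically and frequently, for example to report the progress of the submitted tasks being completed. This is required for the keyboard interrupt to be perceived. In the code below, the number of completed tasks is printed after every N task completions, where N is 10. The idea is to choose N based on your particular worker function so that the completion-count message is printed frequently enough that you do not have to wait too long for the interrupt to take effect after entering Ctrl-c. Of course, you can also use a progress bar, such as the one provided by the tqdm package available from the PyPI repository.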

from multiprocessing import Pool
import signal

def init_pool():
    signal.signal(signal.SIGINT, signal.SIG_IGN)

def worker(x):
    import time
    # processes the number
    time.sleep(.2)

if __name__ == '__main__':
    with Pool(initializer=init_pool) as pool:
        try:
            tasks_completed = 0
            result = []
            for return_value in pool.imap(worker, range(1000)):
                tasks_completed += 1
                if tasks_completed % 10 == 0:
                    print('Tasks completed =', tasks_completed, end='\r')
                result.append(return_value)
        except KeyboardInterrupt:
            print('\nCtrl-c entered.')
        else:
            print()

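Answer 10 (score: -4)

Strangely enough, it looks like you have to handle the KeyboardInterrupt in the children as well. I would have expected this to work as written... try changing slowly_square to: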

def slowly_square(i):
    try:
        sleep(1)
        return i * i
    except KeyboardInterrupt:
        print 'You EVIL bastard!'
        return 0

This should work as you expected.