Check whether a process in multiprocessing.Pool has crashed

Time: 2015-10-19 20:26:05

Tags: python segmentation-fault multiprocessing

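I am running some code that uses an external C library which crashes very rarely. I wrap this code in a multiprocessing.Pool to run it in parallel. I would like to be able to detect whether one of the processes in the pool has segfaulted. From this question, I get the feeling that I need to reimplement the pool using multiprocessing.Process so that I can check is_alive(), but I am not sure how to do that.

For example: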

import multiprocessing
import time

def fn(arg):
    if arg == 4:
        # this should segfault
        # http://codegolf.stackexchange.com/questions/4399/shortest-code-that-return-sigsegv
        import ctypes;ctypes.string_at(0)

    time.sleep(2)    
    return arg**2

def main():
    pool = multiprocessing.Pool()

    inputs = range(10)
    results = pool.map_async(fn, inputs)

    while True:
        if results.ready():
            break

        time.sleep(0.5)
        print("status: still running...")

        # ... some async code ...
        # detect failed process here?

    outputs = results.get()

    print(outputs)


if __name__ == '__main__':
    main()

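How can I rewrite this to detect the segfault? (Also, I tried running it with Python 3.5.0, and it does not seem to raise an exception as suggested in the other question.)

1 answer:

Answer 0: (score: 0)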

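pool._pool is the list of worker processes, as multiprocessing.Process objects. Note that _pool is a private attribute, so it may change between Python versions.

You can check their exit codes. A negative exit code -N means the process was terminated by signal N. Keep in mind that the pool removes workers that have exited and starts replacements, so a crashed worker may no longer be in the list by the time you look:

pool._pool[0].exitcode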

See: https://docs.python.org/2/library/multiprocessing.html#process-and-exceptions
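To illustrate the exit-code check outside of a pool, here is a minimal sketch that runs each input in its own multiprocessing.Process and inspects exitcode after join(). The names describe_exitcode and crash_on_four are made up for this example, and os.kill(..., signal.SIGSEGV) is only a stand-in for the real segfault in the C library. It assumes a Unix-like system and Python 3.5 or later (for signal.Signals).

```python
import multiprocessing
import os
import signal


def describe_exitcode(exitcode):
    """Turn a Process.exitcode value into a short human-readable status."""
    if exitcode is None:
        return "still running"
    if exitcode == 0:
        return "ok"
    if exitcode < 0:
        # A negative exit code -N means the child was terminated by signal N.
        try:
            name = signal.Signals(-exitcode).name
        except ValueError:
            name = "signal %d" % -exitcode
        return "killed by " + name
    return "exited with code %d" % exitcode


def crash_on_four(arg):
    # Hypothetical worker: simulates a segfault for one input (Unix only).
    if arg == 4:
        os.kill(os.getpid(), signal.SIGSEGV)


if __name__ == "__main__":
    # One process per input, so every exit code maps back to exactly one input.
    procs = {arg: multiprocessing.Process(target=crash_on_four, args=(arg,))
             for arg in range(6)}
    for p in procs.values():
        p.start()
    for arg, p in procs.items():
        p.join()
        print(arg, describe_exitcode(p.exitcode))
```

One process per task is expensive compared to a pool, so this is mainly useful when tasks are long-running or when you only want to retry the inputs that went missing.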

Edit after comments

import multiprocessing
import time

def fn(arg):
    if arg == 4:
        # this should segfault
        # http://codegolf.stackexchange.com/questions/4399/shortest-code-that-return-sigsegv
        import ctypes;ctypes.string_at(0)

    time.sleep(1)
    return (arg, arg**2)

def main():
    MAX_TOTAL_DURATION = 40  # seconds to wait for all results
    MAX_TASK_DURATION = 10   # seconds to wait for the next single result
    pool = multiprocessing.Pool()

    inputs = range(10)
    start_time = time.time()
    # chunksize=1 so that a crashed worker loses only the task it was running
    results_iterator = pool.imap_unordered(fn, inputs, 1)
    got_result_for = []
    while len(got_result_for) < len(inputs) and (time.time() - start_time) < MAX_TOTAL_DURATION:
        try:
            # fn returns (arg, result), so we know which inputs completed
            argument, result = results_iterator.next(MAX_TASK_DURATION)
            got_result_for.append(argument)
            print("status: got results for: " + str(got_result_for))
        except multiprocessing.TimeoutError:
            print("one of the tasks timed out")

    # any input without a result is assumed to have crashed or hung
    if len(got_result_for) < len(inputs):
        print("missing results: " + str(set(inputs).difference(set(got_result_for))))



if __name__ == '__main__':
    main()
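As an alternative that is not part of the answer above: on Python 3.3 and later, concurrent.futures.ProcessPoolExecutor raises BrokenProcessPool from future.result() when a worker process dies abruptly, so no timeout is needed to notice the crash. The sketch below assumes that behaviour; the names collect and run_all are made up for this example, and os.kill(..., signal.SIGSEGV) again stands in for the real segfault (Unix only). Once the pool is broken, every task that had not finished yet fails with the same exception, so the "lost" list can contain more than just the input that crashed.

```python
import concurrent.futures
import os
import signal
from concurrent.futures.process import BrokenProcessPool


def fn(arg):
    if arg == 4:
        # Stand-in for the crashing C call (Unix only).
        os.kill(os.getpid(), signal.SIGSEGV)
    return arg ** 2


def collect(futures):
    """Split futures into results and inputs lost to a broken pool.

    `futures` maps each Future to the input it was submitted with.
    """
    results, lost = {}, []
    for future in concurrent.futures.as_completed(futures):
        arg = futures[future]
        try:
            results[arg] = future.result()
        except BrokenProcessPool:
            lost.append(arg)
    return results, lost


def run_all(inputs):
    with concurrent.futures.ProcessPoolExecutor() as executor:
        futures = {executor.submit(fn, arg): arg for arg in inputs}
        return collect(futures)


if __name__ == "__main__":
    results, lost = run_all(range(10))
    print("results:", results)
    print("lost to a crashed worker:", sorted(lost))
```

Because the executor cannot be reused after it breaks, a retry would have to create a new ProcessPoolExecutor and resubmit the lost inputs.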