A Python multiprocessing error

Time: 2010-10-27 05:40:28

Tags: python multiprocess

I have a multiprocessing demo here and I have run into a problem. I spent the whole night on it and still cannot find the cause. Can anyone help?

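I want a parent process that acts as the producer: when tasks arrive, the parent forks some child processes to consume them. The parent also monitors the children, and if one of them exits abnormally the parent restarts it.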


#!/usr/bin/env python
# -*- coding: utf-8 -*-

from multiprocessing import Process, Queue
from Queue import Empty
import sys, signal, os, random, time
import traceback

child_process = []
child_process_num = 4
queue = Queue(0)

def work(queue):
    signal.signal(signal.SIGINT, signal.SIG_DFL)
    signal.signal(signal.SIGTERM, signal.SIG_DFL)
    signal.signal(signal.SIGCHLD, signal.SIG_DFL)

    time.sleep(10) #demo sleep 

def kill_child_processes(signum, frame):
    #terminate all children
    pass

def restart_child_process(signum, frame):
    global child_process

    for i in xrange(len(child_process)):
        child = child_process[i]

        try:
            if child.is_alive():
                continue
        except OSError, e:
            pass

        child.join() #join this process to make sure there is no zombie process

        new_child = Process(target=work, args=(queue,))
        new_child.start()
        child_process[i] = new_child #restart one new process

        child = None
        return

if __name__ == '__main__':
    reload(sys)
    sys.setdefaultencoding("utf-8")

    for i in xrange(child_process_num):
        child = Process(target=work, args=(queue,))
        child.start()
        child_process.append(child)

    signal.signal(signal.SIGINT, kill_child_processes)
    signal.signal(signal.SIGTERM, kill_child_processes) #hook the SIGTERM
    signal.signal(signal.SIGCHLD, restart_child_process)
    signal.signal(signal.SIGPIPE, signal.SIG_DFL)

When this program runs, the following error appears:

Error in atexit._run_exitfuncs:
Error in sys.exitfunc:
Traceback (most recent call last):
  File "/usr/local/python/lib/python2.6/atexit.py", line 30, in _run_exitfuncs
    traceback.print_exc()
  File "/usr/local/python/lib/python2.6/traceback.py", line 227, in print_exc
    print_exception(etype, value, tb, limit, file)
  File "/usr/local/python/lib/python2.6/traceback.py", line 124, in print_exception
    _print(file, 'Traceback (most recent call last):')
  File "/usr/local/python/lib/python2.6/traceback.py", line 12, in _print
    def _print(file, str='', terminator='\n'):
  File "test.py", line 42, in restart_child_process
    new_child.start()
  File "/usr/local/python/lib/python2.6/multiprocessing/process.py", line 99, in start
    _cleanup()
  File "/usr/local/python/lib/python2.6/multiprocessing/process.py", line 53, in _cleanup
    if p._popen.poll() is not None:
  File "/usr/local/python/lib/python2.6/multiprocessing/forking.py", line 106, in poll
    pid, sts = os.waitpid(self.pid, flag)
OSError: [Errno 10] No child processes

If I send a signal to one of the children with kill -SIGINT {child_pid}, I get:

[root@mail1 mail]# kill -SIGINT 32545
[root@mail1 mail]# Error in atexit._run_exitfuncs:
Traceback (most recent call last):
  File "/usr/local/python/lib/python2.6/atexit.py", line 24, in _run_exitfuncs
    func(*targs, **kargs)
  File "/usr/local/python/lib/python2.6/multiprocessing/util.py", line 269, in _exit_function
    p.join()
  File "/usr/local/python/lib/python2.6/multiprocessing/process.py", line 119, in join
    res = self._popen.wait(timeout)
  File "/usr/local/python/lib/python2.6/multiprocessing/forking.py", line 117, in wait
    return self.poll(0)
  File "/usr/local/python/lib/python2.6/multiprocessing/forking.py", line 106, in poll
    pid, sts = os.waitpid(self.pid, flag)
OSError: [Errno 4] Interrupted system call
Error in sys.exitfunc:
Traceback (most recent call last):
  File "/usr/local/python/lib/python2.6/atexit.py", line 24, in _run_exitfuncs
    func(*targs, **kargs)
  File "/usr/local/python/lib/python2.6/multiprocessing/util.py", line 269, in _exit_function
    p.join()
  File "/usr/local/python/lib/python2.6/multiprocessing/process.py", line 119, in join
    res = self._popen.wait(timeout)
  File "/usr/local/python/lib/python2.6/multiprocessing/forking.py", line 117, in wait
    return self.poll(0)
  File "/usr/local/python/lib/python2.6/multiprocessing/forking.py", line 106, in poll
    pid, sts = os.waitpid(self.pid, flag)
OSError: [Errno 4] Interrupted system call

1 answer:

Answer 0 (score: 1)

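The main process waits for all of its child processes to terminate before it exits, so there is a blocking call (namely wait4) registered as an atexit handler. The signal you send interrupts that blocking call, which is what produces the stack trace.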
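A common way to cope with this kind of interruption on Python 2 is to retry the call when it fails with EINTR. The helper below is only a sketch: `retry_on_eintr` is a hypothetical name, not part of multiprocessing, and it only helps for blocking calls you make yourself (it cannot reach the `p.join()` that multiprocessing runs at exit). On Python 3.5 and later the interpreter retries such calls itself (PEP 475), so the helper should be unnecessary there.

```python
import errno


def retry_on_eintr(func, *args, **kwargs):
    """Call func and repeat the call for as long as it fails with EINTR.

    Hypothetical helper for illustration; any other error is re-raised.
    """
    while True:
        try:
            return func(*args, **kwargs)
        except (OSError, IOError) as exc:
            # EINTR means a signal handler ran while the call was blocked.
            if exc.errno != errno.EINTR:
                raise
```

It could be used as `pid, status = retry_on_eintr(os.waitpid, child_pid, 0)`, where `child_pid` stands for the PID of a child you started.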

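What I am not sure about is whether the signal sent to the child gets redirected to the parent process and then interrupts that wait4 call. This has to do with how Unix process groups behave.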
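Whatever the exact signal path turns out to be, one way to avoid the problem may be to stop doing work inside a SIGCHLD handler and have the parent poll its children instead. The following is a minimal sketch under several assumptions: it targets Python 3 on a Unix-like system (it explicitly asks for the "fork" start method), and `work`, `start_worker` and `restart_dead` are hypothetical names, with a sleeping worker standing in for real task processing.

```python
import multiprocessing
import time

# Assumption: a Unix-like system where the "fork" start method is available.
_ctx = multiprocessing.get_context("fork")


def work(seconds):
    """Hypothetical worker body; sleeping stands in for consuming tasks."""
    time.sleep(seconds)


def start_worker(seconds):
    """Start one worker process and return its Process object."""
    child = _ctx.Process(target=work, args=(seconds,))
    child.daemon = True  # do not let a leftover worker block parent exit
    child.start()
    return child


def restart_dead(children, seconds):
    """Replace every dead worker in place; return how many were restarted."""
    restarted = 0
    for i, child in enumerate(children):
        if child.is_alive():
            continue
        child.join()  # reap the dead child so it does not stay a zombie
        children[i] = start_worker(seconds)
        restarted += 1
    return restarted
```

The parent's main loop would then call `restart_dead(children, seconds)` at a regular interval, for example once a second between producing tasks, so no process is started or joined from inside a signal handler.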