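I am writing a daemon program that spawns several child processes. After I run the stop script, the main process keeps running when it is supposed to exit, which confuses me.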
    import daemon, os, signal, time
    from multiprocessing import Process, cpu_count, JoinableQueue

    # http and worker are my own modules (code not shown)
    from http import httpserv
    from worker import work

    class Manager:
        """
        This manager starts the http server processes and worker
        processes, creates the input/output queues that let the processes
        work together nicely.
        """
        def __init__(self):
            self.NUMBER_OF_PROCESSES = cpu_count()

        def start(self):
            self.i_queue = JoinableQueue()
            self.o_queue = JoinableQueue()

            # Create worker processes
            self.workers = [Process(target=work,
                                    args=(self.i_queue, self.o_queue))
                            for i in range(self.NUMBER_OF_PROCESSES)]
            for w in self.workers:
                w.daemon = True
                w.start()

            # Create the http server process
            self.http = Process(target=httpserv, args=(self.i_queue, self.o_queue))
            self.http.daemon = True
            self.http.start()

            # Keep the current process from returning
            self.running = True
            while self.running:
                time.sleep(1)

        def stop(self):
            print "quitting ..."

            # Stop accepting new requests from users
            os.kill(self.http.pid, signal.SIGINT)

            # Waiting for all requests in output queue to be delivered
            self.o_queue.join()

            # Put sentinel None to input queue to signal worker processes
            # to terminate
            self.i_queue.put(None)
            for w in self.workers:
                w.join()
            self.i_queue.join()

            # Let main process return
            self.running = False

    import daemon

    manager = Manager()
    context = daemon.DaemonContext()
    context.signal_map = {
        signal.SIGHUP: lambda signum, frame: manager.stop(),
    }

    context.open()
    manager.start()
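The stop script is just the one line os.kill(pid, signal.SIGHUP). After it runs, the child processes (the worker processes and the http server process) end nicely, but the main process just stays there, and I don't know what keeps it from returning.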
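For reference, a stop script along those lines could look roughly like the sketch below. This is written for Python 3, and the pidfile handling is an assumption for illustration: the question does not say how the script obtains pid, so read_pid, stop_daemon and the pidfile path are hypothetical names.

```python
import os
import signal


def read_pid(pidfile_path):
    """Read the daemon's pid from a pidfile (hypothetical location)."""
    with open(pidfile_path) as f:
        return int(f.read().strip())


def stop_daemon(pidfile_path, sig=signal.SIGHUP):
    """Send the stop signal to the daemon; this is the one-line os.kill() call."""
    pid = read_pid(pidfile_path)
    os.kill(pid, sig)
    return pid
```

Calling stop_daemon with the path of the daemon's pidfile sends SIGHUP, which the signal_map above routes to manager.stop().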
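Answer 0 (score: 1)
I tried a different approach, and this seems to work (note that I took out the daemon portions of the code, since I don't have that module installed).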
    import signal
    from multiprocessing import cpu_count

    class Manager:
        """
        This manager starts the http server processes and worker
        processes, creates the input/output queues that let the processes
        work together nicely.
        """
        def __init__(self):
            self.NUMBER_OF_PROCESSES = cpu_count()

        def start(self):
            # all your code minus the loop
            print "waiting to die"
            signal.pause()

        def stop(self):
            print "quitting ..."
            # all your code minus self.running

    manager = Manager()
    signal.signal(signal.SIGHUP, lambda signum, frame: manager.stop())
    manager.start()
One caveat is that signal.pause() returns as soon as any handled signal arrives, not only SIGHUP, so you may need to change your code accordingly.
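One way to deal with that caveat is to call signal.pause() in a loop and only leave the loop once the stop handler has actually run. The following is a minimal sketch written for Python 3; StopFlag and wait_for_stop are names made up for this illustration and are not part of the code above.

```python
import signal


class StopFlag:
    """Records whether the stop signal has been seen (hypothetical helper)."""

    def __init__(self):
        self.stopping = False

    def request_stop(self, signum, frame):
        # Signal handler: the real cleanup from stop() would go here.
        self.stopping = True


def wait_for_stop(flag):
    """Block until flag.stopping is set, going back to sleep after other signals."""
    wakeups = 0
    while not flag.stopping:
        signal.pause()  # returns after any handled signal, not only SIGHUP
        wakeups += 1
    return wakeups
```

To use it, register the handler with signal.signal(signal.SIGHUP, flag.request_stop) and then call wait_for_stop(flag) where start() currently calls signal.pause(). Note that a signal arriving between the flag check and the call to signal.pause() is only noticed at the next wakeup, so this is a sketch rather than a race-free design.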
Edit:
The following works fine for me:
    import daemon
    import signal
    import time

    class Manager:
        """
        This manager starts the http server processes and worker
        processes, creates the input/output queues that let the processes
        work together nicely.
        """
        def __init__(self):
            self.NUMBER_OF_PROCESSES = 5

        def start(self):
            # all your code minus the loop
            print "waiting to die"
            self.running = 1
            while self.running:
                time.sleep(1)
            print "quit"

        def stop(self):
            print "quitting ..."
            # all your code minus self.running
            self.running = 0

    manager = Manager()
    context = daemon.DaemonContext()
    context.signal_map = {signal.SIGHUP : lambda signum, frame: manager.stop()}

    context.open()
    manager.start()
What version of Python are you using?
Answer 1 (score: 1)
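You create the http server process but don't join() it. What happens if, rather than doing an os.kill() to stop the http server process, you send it a stop-processing sentinel (None, like you send to the workers) and then do a self.http.join()?

Update: You also need to send the None sentinel to the input queue once for each worker. You could try:

    for w in self.workers:
        self.i_queue.put(None)
    for w in self.workers:
        w.join()

N.B. The reason you need two loops is that if you put the None into the queue in the same loop that does the join(), that None may be picked up by a worker other than w, so joining on w will cause the caller to block.

You don't show the code for the workers or the http server, so I assume they are well-behaved in terms of calling task_done etc., and that each worker quits as soon as it sees a None, without get()-ing any more items from the input queue.

Also note that there is at least one open, hard-to-reproduce issue with JoinableQueue.task_done(), which may be biting you.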
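To make the shutdown order concrete, here is a small self-contained sketch written for Python 3. Everything in it is an assumption for illustration, since the real worker and http server code is not shown: work and serve are stand-ins for the actual targets, delivered is a made-up queue that plays the role of the users receiving responses, and the "fork" start method is requested explicitly, so this is Unix-only.

```python
import multiprocessing as mp


def work(i_queue, o_queue):
    """Stand-in worker: doubles each item and quits on the first None it sees."""
    while True:
        item = i_queue.get()
        try:
            if item is None:
                break
            o_queue.put(item * 2)
        finally:
            i_queue.task_done()  # one task_done() per get(), sentinel included


def serve(o_queue, delivered):
    """Stand-in http server: delivers results until it sees the None sentinel."""
    while True:
        result = o_queue.get()
        try:
            if result is None:
                break
            delivered.put(result)
        finally:
            o_queue.task_done()


def run_demo(items=(1, 2, 3, 4), n_workers=3):
    ctx = mp.get_context("fork")  # assumption: Unix with fork available
    i_queue = ctx.JoinableQueue()
    o_queue = ctx.JoinableQueue()
    delivered = ctx.Queue()

    workers = [ctx.Process(target=work, args=(i_queue, o_queue))
               for _ in range(n_workers)]
    for w in workers:
        w.daemon = True
        w.start()
    http = ctx.Process(target=serve, args=(o_queue, delivered))
    http.daemon = True
    http.start()

    for item in items:
        i_queue.put(item)
    # Drain the delivered results before joining any process.
    results = sorted(delivered.get(timeout=10) for _ in items)

    # First loop: one sentinel per worker. Second loop: join each worker.
    for _ in workers:
        i_queue.put(None)
    for w in workers:
        w.join()
    i_queue.join()

    # Stop the server with a sentinel and join it, instead of os.kill().
    o_queue.put(None)
    http.join()
    o_queue.join()

    return results, [p.exitcode for p in workers + [http]]
```

Calling run_demo() returns the sorted results together with the exit codes of the three workers and the server stand-in. If the first loop put only a single None, two of the three workers would stay blocked in get() and the joins in the second loop would never all return, which matches the hang described in the question.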