Ensuring that worker processes always terminate in ZeroMQ

Time: 2011-07-26 14:03:11

Tags: python zeromq

I am implementing a pipeline pattern with ZeroMQ using the Python bindings.

Tasks are fanned out to workers, which listen for new tasks in an infinite loop like this:

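    while True:
        # Block until at least one registered socket is readable.
        socks = dict(self.poller.poll())
        # A new task arrived: decode it and process it.
        if self.receiver in socks and socks[self.receiver] == zmq.POLLIN:
            msg = self.receiver.recv_unicode(encoding='utf-8')
            self.process(msg)
        # The sink announced that all results are in: exit.
        if self.hear in socks and socks[self.hear] == zmq.POLLIN:
            msg = self.hear.recv()
            print self.pid, ":", msg
            sys.exit(0)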

The workers exit when they receive a message from the sink confirming that it has received all the results it expected.

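However, a worker may miss that message and never finish. What is the best way to make sure workers always terminate, given that they have no way of knowing (other than through the message already mentioned) that there are no more tasks to process?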

Here is the test code I wrote to check the state of the workers:

    #-*- coding:utf-8 -*-
    """
    Test module containing tests for all modules of pypln

    """
    import unittest
    from servers.ventilator import Ventilator
    from subprocess import Popen, PIPE
    import time
    class testWorkerModules(unittest.TestCase):
        def setUp(self):
            self.nw = 4
            #spawn 4 workers
            self.ws = [Popen(['python', 'workers/dummy_worker.py'], stdout=None) for i in range(self.nw)]
            #spawn a sink
            self.sink = Popen(['python', 'sinks/dummy_sink.py'], stdout=None)
            #start a ventilator
            self.V = Ventilator()
            # wait for workers and sinks to connect
            time.sleep(1)

        def test_send_unicode(self):
            '''
            Pushing unicode strings through workers to sinks.
            '''

            self.V.push_load([u'são joão' for i in xrange(80)])
            time.sleep(1)
            #[p.wait() for p in self.ws]#wait for the workers to terminate
            wsr = [p.poll() for p in self.ws]
            while None in wsr:
                print wsr, [p.pid for p in self.ws if p.poll() is None] #these are the unfinished workers
                time.sleep(0.5)
                wsr = [p.poll() for p in self.ws]
            self.sink.wait()
            self.sink = self.sink.returncode
            self.assertEqual([0]*self.nw, wsr)
            self.assertEqual(0, self.sink)

    if __name__ == '__main__':
        unittest.main()
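The polling loop in the test above spins forever if any worker misses the sink's message. Independently of how the workers themselves are fixed, the test can bound its wait. The following is only a minimal sketch using the standard `subprocess` module; `wait_all` and its parameters are hypothetical names, not part of the original test suite, and it is written for Python 3 rather than the Python 2 used above.

```python
import subprocess
import sys
import time


def wait_all(procs, deadline_s, interval_s=0.1):
    """Poll procs until all have exited or deadline_s seconds elapse.

    Hypothetical helper: stragglers are killed so the test cannot hang.
    Returns one return code per process; a killed process reports a
    nonzero code.
    """
    end = time.time() + deadline_s
    # Keep checking while time remains and at least one process is alive.
    while time.time() < end and any(p.poll() is None for p in procs):
        time.sleep(interval_s)
    # Anything still running after the deadline is treated as stuck.
    for p in procs:
        if p.poll() is None:
            p.kill()
    # wait() reaps each process and returns its final return code.
    return [p.wait() for p in procs]
```

In the test, something like `wsr = wait_all(self.ws, deadline_s=10)` would replace the `while None in wsr` loop, so a worker that never exits shows up as a failed assertion instead of a hung test run.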

1 answer:

Answer 0 (score: 1)

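All messaging setups end up with heartbeats eventually. If you (as a worker, a sink, or anything else) discover that a component you need to work with is dead, you can basically either try to connect somewhere else or kill yourself. So if you, as a worker, discover that the sink is no longer there, just exit. This also means that you may exit even though the sink is still alive but the connection is broken. I am not sure you can do much more than that, other than perhaps setting all the timeouts more sensibly...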
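As a rough illustration of the heartbeat idea, the sketch below keeps the liveness logic separate from the sockets so it can be run without ZeroMQ. Everything in it is an assumption made for illustration: `SinkLiveness`, `worker_loop`, the `poll` callable, and the `'task'` / `'heartbeat'` / `'done'` event kinds are hypothetical, and it presumes the sink periodically publishes a heartbeat on the same channel it uses for the final "done" message. With pyzmq, `poll` would wrap `self.poller.poll(poll_ms)` (which accepts a timeout in milliseconds) and a `recv()` on each ready socket.

```python
import time


class SinkLiveness(object):
    """Tracks when the sink was last heard from (hypothetical helper)."""

    def __init__(self, timeout_s, clock=time.time):
        self.timeout_s = timeout_s
        self.clock = clock
        self.last_seen = clock()

    def beat(self):
        # Call whenever a heartbeat from the sink arrives.
        self.last_seen = self.clock()

    def expired(self):
        # True once the sink has been silent for longer than timeout_s.
        return self.clock() - self.last_seen > self.timeout_s


def worker_loop(poll, liveness, process, poll_ms=500):
    """Run until the sink says 'done' or stays silent for too long.

    poll(poll_ms) is assumed to return a (possibly empty) list of
    (kind, payload) events after waiting at most poll_ms milliseconds.
    """
    while True:
        for kind, payload in poll(poll_ms):
            if kind == 'task':
                process(payload)
            elif kind == 'heartbeat':
                liveness.beat()
            elif kind == 'done':
                return 'done'
        # The bounded poll guarantees this check runs even when idle.
        if liveness.expired():
            return 'sink-timeout'
```

The key design point is the bounded poll: because the loop wakes up at least every `poll_ms` milliseconds, a worker that misses the "done" message still notices the missing heartbeats and exits. As the answer notes, the trade-off is that a worker may also exit when the sink is alive but the connection is broken, so the timeout should be chosen generously relative to the heartbeat interval.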