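Using RabbitMQ and pika (Python), I am running a job queue system that feeds tasks to nodes (asynchronous consumers). Each message that defines a task is acknowledged only once that task has completed.
Sometimes I need to perform updates on these nodes, so I created an exit mode in which a node waits for its tasks to finish and then exits gracefully. I can then do my maintenance work.
So that a node in exit mode does not receive any more messages from RabbitMQ, I have it call the basic_cancel method before it waits for its jobs to finish.
The effect of this method is described in the pika documentation: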
This method cancels a consumer. This does not affect already
delivered messages, but it does mean the server will not send any more
messages for that consumer. The client may receive an arbitrary number
of messages in between sending the cancel method and receiving the
cancel-ok reply. It may also be sent from the server to the client in
the event of the consumer being unexpectedly cancelled (i.e. cancelled
for any reason other than the server receiving the corresponding
basic.cancel from the client). This allows clients to be notified of
the loss of consumers due to events such as queue deletion.
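So, if "already delivered messages" is taken to include messages that have been received but not yet acknowledged, the tasks that exit mode is waiting on should not be requeued, even though the consumer node running them has cancelled itself from the queueing system.
The code for the stop function of my asynchronous consumer class (taken from the pika example) looks like this: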
def stop(self):
    """Cleanly shutdown the connection to RabbitMQ by stopping the consumer
    with RabbitMQ. When RabbitMQ confirms the cancellation, on_cancelok
    will be invoked by pika, which will then close the channel and
    connection. The IOLoop is started again because this method is invoked
    when CTRL-C is pressed raising a KeyboardInterrupt exception. This
    exception stops the IOLoop which needs to be running for pika to
    communicate with RabbitMQ. All of the commands issued prior to starting
    the IOLoop will be buffered but not processed.
    """
    LOGGER.info('Stopping')
    self._closing = True
    self.stop_consuming()
    LOGGER.info('Waiting for all running jobs to complete')
    for index, thread in enumerate(self.threads):
        if thread.is_alive():
            thread.join()
            # also tried with a while loop that waits 10s as long as the
            # thread is still alive
        LOGGER.info('Thread {} has finished'.format(index))
    # also tried moving the call to stop consuming up to this point
    if self._connection is not None:
        self._connection.ioloop.start()
    LOGGER.info('Closing connection')
    self.close_connection()
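My problem is that once the consumer has been cancelled, the asynchronous consumer no longer seems to send heartbeats, even when I perform the cancellation after the loop that waits for my tasks (threads) to finish.
I have read about the process_data_events function for BlockingConnections, but I could not find such a function here. Is the ioloop of the SelectConnection class the equivalent for an asynchronous consumer?
Because a node in exit mode no longer sends heartbeats, RabbitMQ requeues the tasks that are currently running once the heartbeat timeout is reached. I would like to leave the heartbeat setting untouched, since it is not a problem when I am not in exit mode (my heartbeat is about 100 s, and my tasks can take up to 2 hours to complete).
Looking at the RabbitMQ logs, the heartbeat is indeed the cause: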
=ERROR REPORT==== 12-Apr-2017::19:24:23 ===
closing AMQP connection (.....) :
missed heartbeats from client, timeout: 100s
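The only workaround I can think of is to acknowledge the messages for the tasks still running in exit mode and hope that those tasks do not fail...
Is there any method on the channel or the connection that I can use to send some heartbeats manually while waiting?
Could the problem be that the time.sleep() or thread.join() methods (from the Python threading package) act as blocking calls and prevent other threads from doing what they need to do? I use them in other applications, and they do not seem to behave that way there.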
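For illustration only: thread.join() without a timeout blocks the calling thread until the worker ends, so nothing else can run on that thread in the meantime. A minimal sketch of a wait loop that keeps control is shown below. The names wait_for_threads and pump are hypothetical and not part of pika. With a BlockingConnection, pump could plausibly be a call to connection.process_data_events(time_limit=0); that is an assumption about how one might wire it up, and it does not apply to SelectConnection, which has no such method.

```python
import threading
import time


def wait_for_threads(threads, pump, poll_interval=1.0):
    """Wait until every thread has finished, calling pump() between polls.

    pump is a hypothetical hook supplied by the caller. With a pika
    BlockingConnection it might be
    lambda: connection.process_data_events(time_limit=0)
    so that the waiting thread still lets pika service its socket.
    """
    pending = [t for t in threads if t.is_alive()]
    while pending:
        # join() with a timeout returns after at most poll_interval seconds,
        # whether or not the thread has finished, so this loop keeps control.
        pending[0].join(timeout=poll_interval)
        pump()
        pending = [t for t in pending if t.is_alive()]


if __name__ == "__main__":
    # Stand-in for a long-running job.
    worker = threading.Thread(target=time.sleep, args=(0.3,))
    worker.start()
    pump_calls = []
    wait_for_threads([worker], pump=lambda: pump_calls.append(1),
                     poll_interval=0.1)
    print("worker alive:", worker.is_alive(), "pump calls:", len(pump_calls))
```

The design choice here is simply to bound each wait so that the thread responsible for connection I/O is never parked for the full duration of a job.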
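Since this problem only appears in exit mode, I assume something in the stop function causes the consumer to stop sending heartbeats. However, I have also tried (without any success) calling stop_consuming only after the wait-on-running-tasks loop, and I cannot see what the root of this problem is.
Thank you very much for your help!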
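Answer 0 (score: 0)
It turned out that the stop_consuming function calls basic_cancel asynchronously, with a callback that calls channel.close(). This caused my application to stop its RabbitMQ interaction and RabbitMQ to requeue the unacked messages. I actually realized this when the threads later tried to acknowledge their remaining tasks and raised an error: the channel was by then set to None, so there was no ack method anymore.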
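One possible way to avoid closing the channel too early is to defer the close until the worker threads are done, checking them from the IOLoop instead of blocking it. The following is a rough sketch under stated assumptions, not the code I ended up with: DrainThenClose is a hypothetical helper, ioloop is assumed to provide call_later(delay, callback) as the ioloop of a pika 1.x SelectConnection does, and close_channel stands for whatever the consumer uses to close its channel.

```python
class DrainThenClose:
    """Poll worker threads from the IOLoop; close the channel only when idle.

    Assumptions (not from the original post): `ioloop` offers
    call_later(delay, callback), and `close_channel` is a zero-argument
    callable that closes the consumer's channel.
    """

    def __init__(self, ioloop, threads, close_channel, poll_interval=5.0):
        self._ioloop = ioloop
        self._threads = list(threads)
        self._close_channel = close_channel
        self._poll_interval = poll_interval

    def start(self):
        """Run the first check; later checks are scheduled on the IOLoop."""
        self._check()

    def _check(self):
        if any(t.is_alive() for t in self._threads):
            # Return right away so the IOLoop keeps servicing the socket,
            # and look again after poll_interval seconds.
            self._ioloop.call_later(self._poll_interval, self._check)
        else:
            # No job is running anymore, so no ack is still outstanding
            # from these threads and the channel can be closed.
            self._close_channel()
```

In a consumer modelled on the pika example, the cancel-ok callback could then call something like DrainThenClose(self._connection.ioloop, self.threads, self.close_channel).start() instead of closing the channel directly. The attribute names in that line are taken from the question and the pika example and may differ in your code.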
Hope it helps!