Celery worker disconnects from the broker

Time: 2019-10-12 16:39:50

Tags: python networking rabbitmq celery amqp

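I am using Python, RabbitMQ, and Celery together to distribute tasks to workers. Each task takes about 15 minutes and is CPU-bound, keeping the CPU at 99%. My system has 24 cores, and whenever a worker executes this task I get a connection error with the broker.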

[2019-10-12 08:49:57,695: WARNING/MainProcess] consumer: Connection to broker lost. Trying to re-establish the connection...
[...]
ConnectionResetError: [WinError 10054] An existing connection was forcibly closed by the remote host

I found a few other posts related to this issue, but none of them solved it. Does anyone know how to fix this, especially under heavy CPU load?

  

Windows 10 (worker)

macOS 10.14 (RabbitMQ server)

Python 3.7

Celery 4.3.0 (rhubarb)

RabbitMQ 3.7.16 (Erlang 22.0.7)

My configuration lets the worker consume only one task at a time, and the worker process is restarted after every job, yet the problem persists:

CELERYD_MAX_TASKS_PER_CHILD = 1,
CELERYD_CONCURRENCY = 1,
CELERY_TASK_RESULT_EXPIRES=3600,
CELERYD_PREFETCH_MULTIPLIER = 1,
CELERY_MAX_CACHED_RESULTS = 1,
CELERY_ACKS_LATE = True,
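
For reference, Celery 4 also accepts lowercase setting names, and the uppercase names above are the older spelling. The sketch below only renames the keys from this question to their lowercase counterparts as listed in the Celery configuration documentation. The helper `to_new_style` and the dictionary names are illustrative, not part of Celery, and the renaming by itself is not expected to change the connection behaviour.

```python
# Old-style uppercase names from the question mapped to the lowercase
# names introduced in Celery 4.0. Plain dicts only; Celery is not imported.
OLD_TO_NEW = {
    "CELERYD_MAX_TASKS_PER_CHILD": "worker_max_tasks_per_child",
    "CELERYD_CONCURRENCY": "worker_concurrency",
    "CELERY_TASK_RESULT_EXPIRES": "result_expires",
    "CELERYD_PREFETCH_MULTIPLIER": "worker_prefetch_multiplier",
    "CELERY_MAX_CACHED_RESULTS": "result_cache_max",
    "CELERY_ACKS_LATE": "task_acks_late",
}

# The values are exactly the ones posted in the question.
old_settings = {
    "CELERYD_MAX_TASKS_PER_CHILD": 1,
    "CELERYD_CONCURRENCY": 1,
    "CELERY_TASK_RESULT_EXPIRES": 3600,
    "CELERYD_PREFETCH_MULTIPLIER": 1,
    "CELERY_MAX_CACHED_RESULTS": 1,
    "CELERY_ACKS_LATE": True,
}


def to_new_style(settings):
    """Rename known old-style keys; leave unknown keys untouched."""
    return {OLD_TO_NEW.get(key, key): value for key, value in settings.items()}


new_settings = to_new_style(old_settings)
# With a real Celery app this could be applied as app.conf.update(**new_settings).
```

Old and new names should not be mixed in one configuration, so converting every key at once is the safer route.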

Here is the full stack trace:

[2019-10-12 08:49:57,695: WARNING/MainProcess] consumer: Connection to broker lost. Trying to re-establish the connection...
Traceback (most recent call last):
File "C:\Python37\lib\site-packages\celery\worker\consumer\consumer.py", line 318, in start
    blueprint.start(self)
File "C:\Python37\lib\site-packages\celery\bootsteps.py", line 119, in start
    step.start(parent)
File "C:\Python37\lib\site-packages\celery\worker\consumer\consumer.py", line 596, in start
    c.loop(*c.loop_args())
File "C:\Python37\lib\site-packages\celery\worker\loops.py", line 118, in synloop
    qos.update()
File "C:\Python37\lib\site-packages\kombu\common.py", line 442, in update
    return self.set(self.value)
File "C:\Python37\lib\site-packages\kombu\common.py", line 435, in set
    self.callback(prefetch_count=new_value)
File "C:\Python37\lib\site-packages\celery\worker\consumer\tasks.py", line 47, in set_prefetch_count
    apply_global=qos_global,
File "C:\Python37\lib\site-packages\kombu\messaging.py", line 558, in qos
    apply_global)
File "C:\Python37\lib\site-packages\amqp\channel.py", line 1853, in basic_qos
    wait=spec.Basic.QosOk,
File "C:\Python37\lib\site-packages\amqp\abstract_channel.py", line 68, in send_method
    return self.wait(wait, returns_tuple=returns_tuple)
File "C:\Python37\lib\site-packages\amqp\abstract_channel.py", line 88, in wait
    self.connection.drain_events(timeout=timeout)
File "C:\Python37\lib\site-packages\amqp\connection.py", line 504, in drain_events
    while not self.blocking_read(timeout):
File "C:\Python37\lib\site-packages\amqp\connection.py", line 509, in blocking_read
    frame = self.transport.read_frame()
File "C:\Python37\lib\site-packages\amqp\transport.py", line 252, in read_frame
    frame_header = read(7, True)
File "C:\Python37\lib\site-packages\amqp\transport.py", line 438, in _read
    s = recv(n - len(rbuf))
ConnectionResetError: [WinError 10054] An existing connection was forcibly closed by the remote host

1 Answer:

Answer 0: (score: 1)

I found a workaround for this issue. I suspect the problem lies in the Celery result backend; in my case I am using Redis.

Here is my setup:

Broker - rabbitmq
Backend - redis
Python - 3.7
OS - Windows 10

On the Celery client side, I poll the task's status from the client every 60 seconds. With this in place I did not run into the connection reset.

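from time import sleep

# Poll the task state once a minute instead of blocking on the result.
while not doors_res.ready():
    sleep(60)
result = doors_res.get()  # read the result from the AsyncResult, not from app

Here doors_res is the AsyncResult of the submitted task, and app is the Celery instance that submitted it.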
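
The polling loop above could be wrapped in a small helper that also gives up after a deadline. This is a minimal sketch, not Celery API: `wait_for_result` and `FakeResult` are hypothetical names, and the helper only assumes an object with `ready()` and `get()` methods, which is what a Celery `AsyncResult` provides. The stand-in class is there so the sketch runs without a broker.

```python
import time


def wait_for_result(async_result, poll_interval=60.0, timeout=None, sleep=time.sleep):
    """Poll async_result.ready() until it is done, then return async_result.get()."""
    deadline = None if timeout is None else time.monotonic() + timeout
    while not async_result.ready():
        if deadline is not None and time.monotonic() >= deadline:
            raise TimeoutError("task did not finish within the timeout")
        sleep(poll_interval)
    return async_result.get()


class FakeResult:
    """Hypothetical stand-in for an AsyncResult: not ready for the first N polls."""

    def __init__(self, polls_until_ready, value):
        self._remaining = polls_until_ready
        self._value = value

    def ready(self):
        if self._remaining > 0:
            self._remaining -= 1
            return False
        return True

    def get(self):
        return self._value


# Record the requested sleeps instead of actually waiting.
naps = []
outcome = wait_for_result(FakeResult(3, "done"), poll_interval=60.0, sleep=naps.append)
```

With a real task, the call would look like `wait_for_result(doors_res, poll_interval=60, timeout=3 * 3600)`, where the timeout value is only an example.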

On the Celery worker side:

celery worker -A <celery_file_name> -l info -P gevent

My task ran for about 2 hours and I did not get the connection reset error.