Concurrent Celery tasks execute and store results, but .get() does not work

Time: 2018-09-25 20:37:34

Tags: django rabbitmq celery python-3.6 django-celery

I have written a Celery Task class:

myapp.tasks.py

from __future__ import absolute_import, unicode_literals
from .services.celery import app
from .services.command_service import CommandService
from exceptions.exceptions import *
from .models import Command


class CustomTask(app.Task):

    def run(self, json_string, method_name, cmd_id: int):
        command_obj = Command.objects.get(id=cmd_id)  # type: Command
        try:
            val = eval('CommandService.{}(json_string={})'.format(method_name, json_string))
            status, error = 200, None
        except Exception as e:
            auto_retry = command_obj.auto_retry
            if auto_retry and isinstance(e, CustomError):
                command_obj.retry_count += 1
                command_obj.save()
                return self.retry(countdown=CustomTask._backoff(command_obj.retry_count), exc=e)
            elif auto_retry and isinstance(e, AnotherCustomError) and command_obj.retry_count == 0:
                command_obj.retry_count += 1
                command_obj.save()
                print("RETRYING NOW FOR DEVICE CONNECTION ERROR. TRANSACTION: {} || IP: {}".format(command_obj.transaction_id,
                                                                                                command_obj.device_ip))
                return self.retry(countdown=command_obj.retry_count*2, exc=e)
            val = None
            status, error = self._find_status_code(e)

        return_dict = {"error": error, "status_code": status, "result": val}
        return return_dict

    @staticmethod
    def _backoff(attempts):
        return 2 ** attempts

    @staticmethod
    def _find_status_code(exception):
        if isinstance(exception, APIException):
            detail = exception.default_detail if exception.detail is None else exception.detail
            return exception.status_code, detail

        return 500, CustomTask._get_generic_exc_msg(exception)

    @staticmethod
    def _get_generic_exc_msg(exc: Exception):
        s = ""
        try:
            for msg in exc.args:
                s += msg + ". "
        except Exception:
            s = str(exc)
        return s


CustomTask = app.register_task(CustomTask())

Celery app definition:

from __future__ import absolute_import, unicode_literals
import os
from celery import Celery, Task
from django.conf import settings

# set the default Django settings module for the 'celery' program.
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'myapp.settings')

_celery_broker = settings.CELERY_BROKER  # my broker is amqp://username:password@localhost:5672/myhost
app = Celery('myapp', broker=_celery_broker, backend='rpc://', include=['myapp.tasks', 'myapp.controllers'])
app.config_from_object('django.conf:settings', namespace='CELERY')

app.autodiscover_tasks(['myapp'])
app.conf.update(
    result_expires=4800,
    task_acks_late=True
)

My __init__.py, as the tutorial recommends:

from .celery import app as celery_app

__all__ = ['celery_app']

The controller that runs the task:

from __future__ import absolute_import, unicode_literals
from .services.log_service import LogRunner
from myapp.services.command_service import CommandService
from exceptions.exceptions import *
from myapp.services.celery import app
from myapp.services.tasks import MyTask
from .models import Command

class MyController:
    def my_method(self, json_string):
        <non-async set up stuff here>

        cmd_obj = Command.objects.create(<stuff>)  # type: Command
        task_exec = MyTask.delay(json_string, MyController._method_name, cmd_obj.id)
        cmd_obj.task_id = task_exec
        try:
            return_dict = task_exec.get()
        except Exception as e:
            self._logger.error("ERROR: IP: {} and transaction: {}. Error Type: {}, "
                            "Celery Error: {}".format(ip_addr, transaction_id, type(e), e))
            status_code, error = self._find_status_code(e)
            return_dict = {"error": error, "status_code": status_code, "result": None}
        return return_dict

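**Here is my problem:**

When I hit a view that runs this Django controller one request at a time, one after another, it works perfectly.

However, the external service I am calling throws an error when it receives 2 concurrent requests (this is expected, and that is fine). When I get that error, I automatically retry the task.

Here is the strange part: after the retry, the .get() in my controller stops working for all concurrent requests. My controller just hangs there! And I know that Celery is actually executing the tasks. Here are the logs from the Celery run: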

[2018-09-25 19:10:24,932: INFO/MainProcess] Received task: myapp.tasks.MyTask[bafd62b6-7e29-4c39-86ff-fe903d864c4f]  
[2018-09-25 19:10:25,710: INFO/MainProcess] Received task: myapp.tasks.MyTask[8d3b4279-0b7e-48cf-b45d-0f1f89e213d4]  <-- THIS WILL FAIL BUT THAT IS OK
[2018-09-25 19:10:25,794: ERROR/ForkPoolWorker-1] Could not connect to device with IP <some ip> at all. Retry Later plase
[2018-09-25 19:10:25,798: WARNING/ForkPoolWorker-1] RETRYING NOW FOR DEVICE CONNECTION ERROR. TRANSACTION: b_txn || IP: <some ip>
[2018-09-25 19:10:25,821: INFO/MainProcess] Received task: myapp.tasks.MyTask[8d3b4279-0b7e-48cf-b45d-0f1f89e213d4]  ETA:[2018-09-25 19:10:27.799473+00:00] 
[2018-09-25 19:10:25,823: INFO/ForkPoolWorker-1] Task myapp.tasks.MyTask[8d3b4279-0b7e-48cf-b45d-0f1f89e213d4] retry: Retry in 2s: AnotherCustomError('Could not connect to IP <some ip> at all.',)
[2018-09-25 19:10:27,400: INFO/ForkPoolWorker-2] executed command some command at IP <some ip> 
[2018-09-25 19:10:27,418: INFO/ForkPoolWorker-2] Task myapp.tasks.MyTask[bafd62b6-7e29-4c39-86ff-fe903d864c4f] succeeded in 2.4829552830196917s: {'error': None, 'status_code': 200, 'result': True}
<some command output here from a successful run>  **<-- belongs to task bafd62b6-7e29-4c39-86ff-fe903d864c4f**

[2018-09-25 19:10:31,058: INFO/ForkPoolWorker-2] executed some command at  IP <some ip> 
[2018-09-25 19:10:31,059: INFO/ForkPoolWorker-2] Task command_runner.tasks.MyTask[8d3b4279-0b7e-48cf-b45d-0f1f89e213d4] succeeded in 2.404364461021032s: {'error': None, 'status_code': 200, 'result': True}
<some command output here from a successful run> **<-- belongs to task 8d3b4279-0b7e-48cf-b45d-0f1f89e213d4 which errored and retried itself**

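As you can see, the tasks really are running in Celery! It is just that the .get() in my controller cannot fetch those results back, regardless of whether a task succeeded right away or errored first.

Often, when running concurrent requests, I get the error: "Received 0x50 while expecting 0xce". What is that? Again, strangely, all of this works fine when requests arrive one after another, without Django handling multiple incoming requests at once. Although, I have not been able to retry for single requests.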

1 answer:

Answer 0 (score: 0)

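The RPC backend (which is what .get() is waiting on) is designed to fail if it is used more than once or after a celery restart.

  A result can only be retrieved once, and only by the client that initiated the task. Two different processes can't wait for the same result.

  The messages are transient (non-persistent) by default, so the results will disappear if the broker restarts. You can configure the result backend to send persistent messages using the result_persistent setting.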
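
As a rough sketch of the result_persistent setting mentioned above: the question loads Django settings with namespace='CELERY', so the setting would presumably be spelled CELERY_RESULT_PERSISTENT in that settings module (the file location is an assumption). This only covers results lost on a broker restart; it does not lift the retrieve-once restriction.

```python
# settings.py fragment (assumed location). The question calls
# app.config_from_object('django.conf:settings', namespace='CELERY'),
# so Celery's `result_persistent` setting is read from
# CELERY_RESULT_PERSISTENT.

# Ask the result backend to send result messages as persistent, so they
# are not dropped when the broker restarts.
CELERY_RESULT_PERSISTENT = True
```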

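So it looks like what is happening is that the exception causes celery to stop and break the rpc connection back to the calling controller. Depending on your use case, it may make more sense to use a persistent result backend such as Redis or a database.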
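
A minimal sketch of switching to a persistent backend, assuming a Redis server on localhost (the URL and database number are illustrative, not taken from the question) and the same namespace='CELERY' settings loading used in the question:

```python
# settings.py fragment (a sketch; the Redis URL is an assumed local
# instance and does not come from the question).

# With namespace='CELERY', Celery's `result_backend` setting is read from
# CELERY_RESULT_BACKEND. A Redis backend stores each result under a key,
# so it can be read more than once and from any process.
CELERY_RESULT_BACKEND = "redis://localhost:6379/0"

# Expire stored results after 4800 seconds, matching the `result_expires`
# value already passed to app.conf.update(...) in the question.
CELERY_RESULT_EXPIRES = 4800
```

The explicit backend='rpc://' argument in the question's Celery(...) call would likely need to be changed to the same URL or removed, since a value passed to the constructor may take precedence over the settings module. The Redis client library also has to be installed, for example via the celery[redis] extra.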