Question

I am building a monitoring application that sends log records from Celery tasks to an ELK stack.

So far I have this:

from project.celery import app


def monitor(app):
    state = app.events.State()

    def on_event_success(event):
        # Feed the event into the in-memory state so the task can be looked up.
        state.event(event)

        task = state.tasks.get(event['uuid'])
        if task.name:
            task_name = task.name
            task_origin = task.origin
            task_type = task.type
            task_worker = task.worker
            task_info = task.info()
            task_log = "TASK NAME {}, TASK ORIGIN {}, TASK TYPE {}, TASK WORKER {}, TASK ARGS {}\n\n".format(
                task_name, task_origin, task_type, task_worker, task_info['args'])
            print("SUCCESS: {}".format(task_log))

    with app.connection() as connection:
        recv = app.events.Receiver(connection, handlers={
            'task-succeeded': on_event_success
        })
        recv.capture(limit=None, timeout=None, wakeup=True)


if __name__ == '__main__':
    application = app
    monitor(app)

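With this code I can capture almost all the information available on a task, but I have not found a way to capture which queue the task was executed from.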

I have two queues:

CELERY_QUEUES = (
    # Persistent task queue
    Queue('celery', routing_key='celery'),
    # Persistent routine task queue
    Queue('routine', routing_key='routine')
)

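I would like to get this information from the task object created by the event, so that I know which queue the task execution originated from.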

Answer 1

To do this, you need to enable the task-sent event.
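As a rough sketch of what enabling it might look like: the relevant Celery setting is `task_send_sent_event`, which is disabled by default. The fragment below assumes Celery 4 or later with lowercase setting names and a configuration module loaded through `app.config_from_object`; the module name `celeryconfig` is only an illustration, and projects that load settings under a namespace prefix would need the prefixed form instead.

```python
# celeryconfig.py (hypothetical module name)

# Ask clients to emit a task-sent event every time they publish a task,
# so a monitor can see the queue and routing key before a worker picks it up.
# Older uppercase-style configurations call this CELERY_SEND_TASK_SENT_EVENT.
task_send_sent_event = True
```

The setting has to be active in the process that publishes the task, since that is where the event is emitted.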

You also need to implement a handler for the task-sent event, just as you did for task-succeeded.

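Your monitoring application should keep at least the task ID (event["uuid"]) and the routing key (event["routing_key"]) from every captured task-sent event. I use a TTLCache from cachetools for this, and I read from that dictionary in the task-succeeded and task-failed event handlers whenever I need the routing key.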
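The correlation step can be sketched on its own, without a broker. This is only an illustration: the events are hand-written dictionaries containing just the fields used here, a plain dict stands in for the cache, and the helper names `remember_sent` and `routing_key_for` are made up for the example.

```python
# Plain dict standing in for the cache; a TTL-based cache would replace it
# in a long-running monitor.
sent_tasks = {}


def remember_sent(event):
    """Store routing details from a task-sent event, keyed by task id."""
    sent_tasks.setdefault(event["uuid"], {
        "queue": event.get("queue"),
        "routing_key": event.get("routing_key"),
    })


def routing_key_for(event):
    """Return the routing key recorded for this task, or None if unseen."""
    cached = sent_tasks.get(event["uuid"])
    return cached["routing_key"] if cached else None


# Hand-written sample events, not real Celery output.
remember_sent({"uuid": "abc-123", "queue": "routine", "routing_key": "routine"})
print(routing_key_for({"uuid": "abc-123"}))  # routine
print(routing_key_for({"uuid": "missing"}))  # None
```

Returning None for an unknown task matters in practice, because a monitor that starts after a task was published never sees its task-sent event.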

If you want the task name and arguments, as in your example, you need to handle the task-received event in the same way as described above...

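You may wonder why I use TTLCache: our Celery cluster runs millions of tasks per day, and keeping the event data for every sent task in memory would quickly use up all available memory.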
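To make the memory argument concrete, here is a small standard-library stand-in for the idea behind a TTL cache. It is not cachetools' implementation: the class name `ExpiringDict` and the injectable `clock` parameter are assumptions made for this sketch, and the purge is a simple linear scan.

```python
import time


class ExpiringDict:
    """Tiny stand-in for a TTL cache: entries vanish after `ttl` seconds."""

    def __init__(self, ttl, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock
        self._data = {}  # key -> (expiry_time, value)

    def _purge(self):
        # Drop every entry whose expiry time has passed.
        now = self.clock()
        expired = [k for k, (exp, _) in self._data.items() if exp <= now]
        for key in expired:
            del self._data[key]

    def __setitem__(self, key, value):
        self._purge()
        self._data[key] = (self.clock() + self.ttl, value)

    def __getitem__(self, key):
        self._purge()
        return self._data[key][1]

    def __contains__(self, key):
        self._purge()
        return key in self._data

    def __len__(self):
        self._purge()
        return len(self._data)


# A fake clock lets the example run instantly.
now = [0.0]
cache = ExpiringDict(ttl=60, clock=lambda: now[0])
cache["task-1"] = {"routing_key": "celery"}
now[0] = 30
cache["task-2"] = {"routing_key": "routine"}
now[0] = 61  # task-1 is now older than 60 seconds
print("task-1" in cache, "task-2" in cache, len(cache))  # False True 1
```

Because old entries are dropped, memory use tracks the number of tasks sent within the TTL window rather than the total number of tasks ever seen.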

Finally, here is the code that caches the task-sent data and uses it in the task-succeeded event handler:

from cachetools import TTLCache
from project.celery import app


def monitor(app):
    state = app.events.State()

    # keep a couple of days of history in case not acknowledged tasks are retried
    task_info = TTLCache(float('inf'), 3.2 * 24 * 60 * 60)

    def on_event_success(event):
        nonlocal task_info
        state.event(event)

        task = state.tasks.get(event['uuid'])
        if task.name:
            task_name = task.name
            task_origin = task.origin
            task_type = task.type
            task_worker = task.worker
            t_info = task.info()
            task_log = "TASK NAME {}, TASK ORIGIN {}, TASK TYPE {}, TASK WORKER {}, TASK ARGS {}".format(
                task_name, task_origin, task_type, task_worker, t_info['args'])
            print("SUCCESS: {}".format(task_log))
            # Look up the routing details recorded when the task was sent.
            if event["uuid"] in task_info:
                cached_task = task_info[event["uuid"]]
                if "routing_key" in cached_task:
                    print("    routing_key: {}\n\n".format(cached_task["routing_key"]))

    def on_task_sent(event):
        # task-sent(uuid, name, args, kwargs, retries, eta, expires, queue, exchange,
        # routing_key, root_id, parent_id)
        nonlocal task_info
        if event["uuid"] not in task_info:
            task_info[event["uuid"]] = {"name": event["name"],
                                        "args": event["args"],
                                        "queue": event["queue"],
                                        "routing_key": event["routing_key"]}

    with app.connection() as connection:
        recv = app.events.Receiver(connection, handlers={
            'task-succeeded': on_event_success,
            'task-sent': on_task_sent,
            '*': state.event
        })
        # capture() must run while the connection is still open.
        recv.capture(limit=None, timeout=None, wakeup=True)


if __name__ == '__main__':
    application = app
    monitor(app)

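I never had enough time to study Celery's celery.events.state.State class. I do know that it uses an LRUCache to cache some entries, but I am not sure whether it could be used in place of the TTLCache I use in my code...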

How to get the queue a task was executed from in Celery
