TL;DR:
I'm queuing short, simple tasks to myNotifyTask() through celeryd, using beanstalkd as the broker (e.g. task.delay rather than myNotifyTask.delay()). Even though the delay should be effectively immediate, the tasks take about an hour to execute when they should only take a few seconds.
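For context, the setup looks roughly like this (a minimal sketch, not my actual code; the module name, broker host, and send_notification() are placeholders):

# tasks.py -- simplified illustration of how the task is defined and queued
from celery import Celery

app = Celery('tasks', broker='beanstalk://beanstalk_server:11300')

@app.task
def myNotifyTask(recipient):
    # short notification job; should finish in a couple of seconds
    send_notification(recipient)  # placeholder for the real work

# caller side: .delay() returns immediately and the task should run almost right away
myNotifyTask.delay('someone@example.com')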
From what I can observe, the tasks do appear to arrive in beanstalkd, but they sit in the ready state for a long time. This happens even with CELERYD_CONCURRENCY = 8 set. Looking at the beanstalkd log, I see errors about read(): Connection reset by peer, but the tasks do eventually execute.
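The relevant worker configuration is essentially (again a sketch; CELERYD_CONCURRENCY and BROKER_URL are the real Celery 3.0 setting names, the rest is illustrative):

# celeryconfig.py
BROKER_URL = 'beanstalk://beanstalk_server:11300'
CELERYD_CONCURRENCY = 8        # eight worker processes, so short tasks shouldn't have to queue
CELERY_IMPORTS = ('tasks',)    # make sure the worker loads the task module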
Any idea why this is happening? Details below.
Using beanstalkd version 1.4.6 and celery 3.0.20.
The beanstalkd log entries look like this:
/usr/bin/beanstalkd: prot.c:709 in check_err: read(): Connection reset by peer
When trying to diagnose the problem with celery:
> celery -b "beanstalk://beanstalk_server:11300" status
Error: No nodes replied within time constraint.
When I connect to beanstalkd via telnet, I see current-jobs-ready: 343, which indicates the jobs are stuck in the ready state (not delayed). Here is the full output:
> telnet localhost 11300
stats
OK 850
---
current-jobs-urgent: 343
current-jobs-ready: 343
current-jobs-reserved: 0
current-jobs-delayed: 0
current-jobs-buried: 0
cmd-put: 2484
cmd-peek: 0
cmd-peek-ready: 7
cmd-peek-delayed: 1
cmd-peek-buried: 1
cmd-reserve: 0
cmd-reserve-with-timeout: 52941
cmd-delete: 2141
cmd-release: 0
cmd-use: 2485
cmd-watch: 42
cmd-ignore: 40
cmd-bury: 0
cmd-kick: 0
cmd-touch: 0
cmd-stats: 497655
cmd-stats-job: 2141
cmd-stats-tube: 3
cmd-list-tubes: 2
cmd-list-tube-used: 1
cmd-list-tubes-watched: 52954
cmd-pause-tube: 0
job-timeouts: 0
total-jobs: 2484
max-job-size: 65535
current-tubes: 3
current-connections: 6
current-producers: 2
current-workers: 2
current-waiting: 1
total-connections: 502958
pid: 989
version: 1.4.6
rusage-utime: 45.778861
rusage-stime: 56.595537
uptime: 2489047
binlog-oldest-index: 0
binlog-current-index: 0
binlog-max-size: 10485760
And shortly after that:
stats-tube celery
OK 257
---
name: celery
current-jobs-urgent: 348
current-jobs-ready: 348
current-jobs-reserved: 0
current-jobs-delayed: 0
current-jobs-buried: 0
total-jobs: 2739
current-using: 3
current-watching: 1
current-waiting: 0
cmd-pause-tube: 0
pause: 0
pause-time-left: 0
Answer 0 (score: 0):
It turned out the problem was that one celery task had a very long timeout, which caused its worker to wait for a long time. Even with concurrency enabled, that timeout was simply too long, and tasks kept piling up in beanstalkd (with no celery workers consuming them, because all of the workers eventually ended up busy).
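One way to keep a single slow task from tying up all the workers like that is to enforce time limits (a sketch, not what was originally deployed; the limits here are arbitrary and send_notification() is a placeholder):

# celeryconfig.py -- Celery 3.0 setting names
CELERYD_TASK_SOFT_TIME_LIMIT = 30   # raises SoftTimeLimitExceeded inside the task after 30s
CELERYD_TASK_TIME_LIMIT = 60        # forcibly terminates the worker process after 60s

Or per task:

from celery.exceptions import SoftTimeLimitExceeded

@app.task(soft_time_limit=30, time_limit=60)
def myNotifyTask(recipient):
    try:
        send_notification(recipient)  # placeholder for the real work
    except SoftTimeLimitExceeded:
        # give up (or retry later) instead of blocking a worker slot for an hour
        pass

With the hard limit in place, a hung task gets killed and its worker slot is freed, so the rest of the queue keeps draining.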