运行Django 1.4.10,Celery 3.1.7,Python 2.7,Kombu 3.0.12,并使用django supervisor运行芹菜工作者和芹菜作为守护进程。使用IronMQ作为经纪人。
一切都能很好地处理周期性任务好几天,然后突然间工人会死于致命:
[2014-08-04 16:52:17,647: ERROR/MainProcess] Unrecoverable error: TypeError("sequence index must be integer, not 'unicode'",)
Traceback (most recent call last):
File "/home/kromedev/webapps/gorilla/lib/python2.7/celery-3.1.7-py2.7.egg/celery/worker /__init__.py", line 206, in start
self.blueprint.start(self)
File "/home/kromedev/webapps/gorilla/lib/python2.7/celery-3.1.7-py2.7.egg/celery/bootsteps.py", line 123, in start
step.start(parent)
File "/home/kromedev/webapps/gorilla/lib/python2.7/celery-3.1.7-py2.7.egg/celery/bootsteps.py", line 373, in start
return self.obj.start()
File "/home/kromedev/webapps/gorilla/lib/python2.7/celery-3.1.7-py2.7.egg/celery/worker/consumer.py", line 270, in start
blueprint.start(self)
File "/home/kromedev/webapps/gorilla/lib/python2.7/celery-3.1.7-py2.7.egg/celery/bootsteps.py", line 123, in start
step.start(parent)
File "/home/kromedev/webapps/gorilla/lib/python2.7/celery-3.1.7-py2.7.egg/celery/worker/consumer.py", line 786, in start
c.loop(*c.loop_args())
File "/home/kromedev/webapps/gorilla/lib/python2.7/celery-3.1.7-py2.7.egg/celery/worker/loops.py", line 99, in synloop
connection.drain_events(timeout=2.0)
File "/home/kromedev/webapps/gorilla/lib/python2.7/kombu-3.0.12-py2.7.egg/kombu/connection.py", line 279, in drain_events
return self.transport.drain_events(self.connection, **kwargs)
File "/home/kromedev/webapps/gorilla/lib/python2.7/kombu-3.0.12-py2.7.egg/kombu/transport/virtual/__init__.py", line 844, in drain_events
self._callbacks[queue](message)
File "/home/kromedev/webapps/gorilla/lib/python2.7/kombu-3.0.12-py2.7.egg/kombu/transport/virtual/__init__.py", line 529, in _callback
message = self.Message(self, raw_message)
File "/home/kromedev/webapps/gorilla/lib/python2.7/kombu-3.0.12-py2.7.egg/kombu/transport/virtual/__init__.py", line 242, in __init__
properties = payload['properties']
TypeError: sequence index must be integer, not 'unicode'
查看时间戳,在任务完成后,它似乎总是崩溃。因为我有我的supervisor.conf文件自动重启,然后它进入循环尝试重新启动工作程序并立即失败。但是,如果我使用supervisor关闭所有进程并重新启动,那么它将再次运行几天。
Supervisor.conf:
[program:celeryd]
command={{ PYTHON }} {{ PROJECT_DIR }}/manage.py celery worker -l info
numprocs=1
stdout_logfile={{ PROJECT_DIR }}/worker.log
stderr_logfile={{ PROJECT_DIR }}/worker.log
autostart=true
autorestart=true
startsecs=10
stopwaitsecs=600
killasgroup=true
priority=998
[program:celerybeat]
command={{ PYTHON }} {{ PROJECT_DIR }}/manage.py celery beat -l INFO
numprocs=1
stdout_logfile={{ PROJECT_DIR }}/beat.log
stderr_logfile={{ PROJECT_DIR }}/beat.log
autostart=true
autorestart=true
startsecs=10
priority=999
不知道在哪里开始调试这个 - 把它扔到那里,因为其他人有同样的问题并且可以对它进行一些启发。
更新:我切换回可靠的Erlang / Rabbit,问题完全消失了。所以这是IronMQ。我无法从他们那里得到任何帮助,所以他们失去了一个顾客!