When I try to retry a failed task, I intermittently (about 20% of the time) get an IOError exception from Celery.
Here is my task:
@task
def update_data(pk_id):
    try:
        pk = PK.objects.get(pk=pk_id)
        results = pk.get_update()
        return results
    except urllib2.HTTPError, exc:
        print "Let's retry in a few minutes."
        update_data.retry(exc=exc, countdown=600)
The exception:
[2011-10-07 11:35:53,594: ERROR/MainProcess] Task report.tasks.update_data[1babd4e3-45eb-4fa3-a497-68b67bb4a6df] raised exception: IOError()
Traceback (most recent call last):
File "/home/prj/prj_env/lib/python2.6/site-packages/celery/execute/trace.py", line 36, in trace
return cls(states.SUCCESS, retval=fun(*args, **kwargs))
File "/home/prj/prj_env/lib/python2.6/site-packages/celery/app/task/__init__.py", line 232, in __call__
return self.run(*args, **kwargs)
File "/home/prj/prj_env/lib/python2.6/site-packages/celery/app/__init__.py", line 172, in run
return fun(*args, **kwargs)
File "/home/prj/prj/report/tasks.py", line 109, in update_data
update_data.retry(exc=exc, countdown=600)
File "/home/prj/prj_env/lib/python2.6/site-packages/celery/app/task/__init__.py", line 520, in retry
self.name, options["task_id"], args, kwargs))
HTTPError
RabbitMQ Logs
=INFO REPORT==== 7-Oct-2011::15:35:43 ===
closing TCP connection <0.4294.17> from 10.254.122.225:59704
=WARNING REPORT==== 7-Oct-2011::15:35:43 ===
exception on TCP connection <0.4330.17> from 10.254.122.225:59715
connection_closed_abruptly
=INFO REPORT==== 7-Oct-2011::15:35:43 ===
closing TCP connection <0.4330.17> from 10.254.122.225:59715
=WARNING REPORT==== 7-Oct-2011::15:35:49 ===
exception on TCP connection <0.4313.17> from 10.254.122.225:59709
connection_closed_abruptly
=INFO REPORT==== 7-Oct-2011::15:35:49 ===
closing TCP connection <0.4313.17> from 10.254.122.225:59709
=WARNING REPORT==== 7-Oct-2011::15:35:49 ===
exception on TCP connection <0.4350.17> from 10.254.122.225:59720
connection_closed_abruptly
=INFO REPORT==== 7-Oct-2011::15:35:49 ===
closing TCP connection <0.4350.17> from 10.254.122.225:59720
=INFO REPORT==== 7-Oct-2011::15:36:22 ===
accepted TCP connection on [::]:5672 from 10.255.199.63:50526
=INFO REPORT==== 7-Oct-2011::15:36:22 ===
starting TCP connection <0.4501.17> from 10.255.199.63:50526
Any ideas why this is happening?
Thanks!
Answer 0 (score: 0)
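Is it possible to save every task in the database and retry any for which no result has been received after a while? Or does the scheduler perhaps have its own persistent storage? And what happens if a worker crashes while receiving or executing a task?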
Answer 1 (score: 0)
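max_retries defaults to 3, so if the same task fails on every allowed attempt (which may be the 20% of the time you mention), retry re-raises the exception passed to it instead of scheduling another attempt.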
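
To make the max_retries behaviour concrete, here is a minimal sketch in plain Python. It does not use Celery at all: it only simulates the documented rule that a task is retried up to max_retries times and that the original exception is re-raised once that budget is used up. The names HTTPError, run_with_retries and make_flaky are hypothetical stand-ins invented for this illustration, and the countdown delay is left out.

```python
class HTTPError(Exception):
    """Hypothetical stand-in for urllib2.HTTPError, for illustration only."""


MAX_RETRIES = 3  # mirrors Celery's documented default for max_retries


def run_with_retries(func, max_retries=MAX_RETRIES):
    """Call func; on HTTPError retry up to max_retries times, then re-raise."""
    retries = 0
    while True:
        try:
            return func()
        except HTTPError:
            if retries >= max_retries:
                # Retry budget exhausted: the original exception surfaces.
                raise
            retries += 1


def make_flaky(failures):
    """Return a function that fails `failures` times and then succeeds."""
    state = {"calls": 0}

    def flaky():
        state["calls"] += 1
        if state["calls"] <= failures:
            raise HTTPError("simulated failure %d" % state["calls"])
        return "ok"

    return flaky
```

With the default of 3, a function that fails three times still succeeds on its fourth call, while one that fails four times lets the HTTPError escape. If that is what is happening here, raising max_retries on the task (or setting it to None to retry indefinitely) would change how often the exception reaches the log.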