I get this when running a Scrapy parsing function (which can sometimes take up to 10 minutes) inside a Celery task.
I use:
- Django == 1.6.5
- django-celery == 3.1.16
- celery == 3.1.16
- psycopg2 == 2.5.5 (I also tried psycopg2 == 2.5.4)
[2015-07-19 11:27:49,488: CRITICAL/MainProcess] Task myapp.parse_items[63fc40eb-c0d6-46f4-a64e-acce8301d29a] INTERNAL ERROR: InterfaceError('connection already closed',)
Traceback (most recent call last):
  File "/home/mo/Work/python/pb-env/local/lib/python2.7/site-packages/celery/app/trace.py", line 284, in trace_task
    uuid, retval, SUCCESS, request=task_request,
  File "/home/mo/Work/python/pb-env/local/lib/python2.7/site-packages/celery/backends/base.py", line 248, in store_result
    request=request, **kwargs)
  File "/home/mo/Work/python/pb-env/local/lib/python2.7/site-packages/djcelery/backends/database.py", line 29, in _store_result
    traceback=traceback, children=self.current_task_children(request),
  File "/home/mo/Work/python/pb-env/local/lib/python2.7/site-packages/djcelery/managers.py", line 42, in _inner
    return fun(*args, **kwargs)
  File "/home/mo/Work/python/pb-env/local/lib/python2.7/site-packages/djcelery/managers.py", line 181, in store_result
    'meta': {'children': children}})
  File "/home/mo/Work/python/pb-env/local/lib/python2.7/site-packages/djcelery/managers.py", line 87, in update_or_create
    return get_queryset(self).update_or_create(**kwargs)
  File "/home/mo/Work/python/pb-env/local/lib/python2.7/site-packages/djcelery/managers.py", line 70, in update_or_create
    obj, created = self.get_or_create(**kwargs)
  File "/home/mo/Work/python/pb-env/local/lib/python2.7/site-packages/django/db/models/query.py", line 376, in get_or_create
    return self.get(**lookup), False
  File "/home/mo/Work/python/pb-env/local/lib/python2.7/site-packages/django/db/models/query.py", line 304, in get
    num = len(clone)
  File "/home/mo/Work/python/pb-env/local/lib/python2.7/site-packages/django/db/models/query.py", line 77, in __len__
    self._fetch_all()
  File "/home/mo/Work/python/pb-env/local/lib/python2.7/site-packages/django/db/models/query.py", line 857, in _fetch_all
    self._result_cache = list(self.iterator())
  File "/home/mo/Work/python/pb-env/local/lib/python2.7/site-packages/django/db/models/query.py", line 220, in iterator
    for row in compiler.results_iter():
  File "/home/mo/Work/python/pb-env/local/lib/python2.7/site-packages/django/db/models/sql/compiler.py", line 713, in results_iter
    for rows in self.execute_sql(MULTI):
  File "/home/mo/Work/python/pb-env/local/lib/python2.7/site-packages/django/db/models/sql/compiler.py", line 785, in execute_sql
    cursor = self.connection.cursor()
  File "/home/mo/Work/python/pb-env/local/lib/python2.7/site-packages/django/db/backends/__init__.py", line 160, in cursor
    cursor = self.make_debug_cursor(self._cursor())
  File "/home/mo/Work/python/pb-env/local/lib/python2.7/site-packages/django/db/backends/__init__.py", line 134, in _cursor
    return self.create_cursor()
  File "/home/mo/Work/python/pb-env/local/lib/python2.7/site-packages/django/db/utils.py", line 99, in __exit__
    six.reraise(dj_exc_type, dj_exc_value, traceback)
  File "/home/mo/Work/python/pb-env/local/lib/python2.7/site-packages/django/db/backends/__init__.py", line 134, in _cursor
    return self.create_cursor()
  File "/home/mo/Work/python/pb-env/local/lib/python2.7/site-packages/django/db/backends/postgresql_psycopg2/base.py", line 137, in create_cursor
    cursor = self.connection.cursor()
InterfaceError: connection already closed
Answer 0 (score: 12)
Unfortunately this is a problem with the django + psycopg2 + celery combination. It is an old and still unsolved issue.
Take a look at this thread to understand it: https://github.com/celery/django-celery/issues/121
Basically, when celery starts a worker, it forks the database connection from the django.db framework. If this connection drops for some reason, a new one is not created. Celery itself has nothing to do with this problem, since there is no way to detect that the database connection was dropped using the django.db library. Django does not notify you when it happens, because it simply opens a connection when it receives a wsgi call (there is no connection pool). I had the same problem in a huge production environment with a lot of worker machines, and sometimes those machines lost connectivity with the postgres server.
I solved it by putting each celery master process under a linux supervisord handler and a watcher, and by implementing a decorator that handles psycopg2.InterfaceError: when it is raised, the decorator dispatches a system call to force supervisor to restart the celery process with SIGINT.
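A minimal sketch of such a decorator (the name restart_on_interface_error and the exact restart mechanics are assumptions; it relies on supervisord being configured to relaunch the worker once it exits):

import os
import signal
from functools import wraps

import psycopg2


def restart_on_interface_error(func):
    """Hypothetical decorator: if psycopg2 reports a dead connection,
    send SIGINT to this worker so supervisord restarts it and the next
    task gets a fresh database connection."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except psycopg2.InterfaceError:
            # supervisord relaunches the process after it exits.
            os.kill(os.getpid(), signal.SIGINT)
            raise
    return wrapper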
Edit:
I found a better solution. I implemented a celery task base class like this:
from django.db import connection
import celery

class FaultTolerantTask(celery.Task):
    """ Implements after return hook to close the invalid connection.
    This way, django is forced to serve a new connection for the next
    task.
    """
    abstract = True

    def after_return(self, *args, **kwargs):
        connection.close()

@celery.task(base=FaultTolerantTask)
def my_task():
    # my database dependent code here
    pass
I believe it will solve your problem too.
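A related variant, sketched here under the assumption that you run Django 1.6 or newer, lets Django decide whether the connection is actually dead instead of closing it unconditionally, using the same close_if_unusable_or_obsolete() check as the loader in the next answer (the class name is made up):

from django.db import connections
import celery


class ConnectionCheckingTask(celery.Task):
    """Sketch: drop only the connections that Django reports as
    unusable or older than CONN_MAX_AGE, instead of closing the
    default connection after every task."""
    abstract = True

    def after_return(self, *args, **kwargs):
        for conn in connections.all():
            conn.close_if_unusable_or_obsolete()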
Answer 1 (score: 5)
Guys and emanuelcds,
I ran into the same problem. I have now updated my code and created a new loader for celery:
from djcelery.loaders import DjangoLoader
from django import db

class CustomDjangoLoader(DjangoLoader):
    def on_task_init(self, task_id, task):
        """Called before every task."""
        for conn in db.connections.all():
            conn.close_if_unusable_or_obsolete()
        super(CustomDjangoLoader, self).on_task_init(task_id, task)
Of course, if you are using djcelery, this also requires something like the following in your settings:
CELERY_LOADER = 'myproject.loaders.CustomDjangoLoader'
os.environ['CELERY_LOADER'] = CELERY_LOADER
I still have to test this; I will update when I do.