I'm trying to process an entire CSV file as fast as possible, so I want to process each line in parallel as a Celery task. Cleanup is also a Celery task and must wait until every line has been processed. See the example below.
The problem is that I can't seem to get through the whole file, because I keep hitting MySQL connection errors. So far I have seen these two:
2013, 'Lost connection to MySQL server during query'
2006, 'MySQL server has gone away'
I'm using Celery 3.1.18 and SQLAlchemy 0.9.9. I'm also using connection pooling.
from app.db.meta import Session
from celery import chord, Celery
from celery.signals import task_postrun

celery = Celery()
celery.config_from_object('config')


@task_postrun.connect
def close_session(*args, **kwargs):
    # return the scoped session's connection to the pool after every task
    Session.remove()


def main():
    # process each line in parallel
    header = [process_line.s(line) for line in csv_file]
    # pass stats to cleanup after all lines are processed
    callback = cleanup.s()
    chord(header)(callback)


@celery.task
def process_line(line):
    session = Session()
    ...
    # process line
    ...
    return stats


@celery.task
def cleanup(stats):
    session = Session()
    ...
    # do cleanup and log stats
    ...
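For context on the errors above: 2006 and 2013 usually mean a worker reused a connection that MySQL had already closed, which commonly happens when the engine's pool is created before Celery forks its worker processes, or when pooled connections outlive MySQL's `wait_timeout`. Below is a minimal, hedged sketch of one common mitigation; the in-memory SQLite URL is a stand-in for the real `mysql://` URL, and `reset_pool` is a hypothetical hook you would connect to `celery.signals.worker_process_init`:

```python
from sqlalchemy import create_engine, text
from sqlalchemy.orm import scoped_session, sessionmaker

# 'sqlite://' is a stand-in; point this at your real mysql:// URL.
# pool_recycle closes pooled connections older than MySQL's wait_timeout.
engine = create_engine('sqlite://', pool_recycle=3600)
Session = scoped_session(sessionmaker(bind=engine))


def reset_pool(**kwargs):
    # Hypothetical hook: connect it to celery.signals.worker_process_init
    # so every forked worker drops the connections it inherited from the
    # parent process; the pool opens fresh sockets on first use instead.
    engine.dispose()


reset_pool()  # simulate the per-worker hook firing
print(Session().execute(text('SELECT 1')).scalar())  # → 1
```

This way no two worker processes ever share a socket, which is a frequent root cause of both error codes.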
Answer 0 (score: 0)
Read the answer. In short, you have to either disable SQLAlchemy's connection pool or ping the MySQL server on checkout:
from flask.ext.sqlalchemy import SQLAlchemy
from sqlalchemy import event, exc


def instance(app):
    """:rtype: SQLAlchemy"""
    db = SQLAlchemy(app)
    if app.testing:
        return db

    @event.listens_for(db.engine, 'checkout')
    def checkout(dbapi_con, con_record, con_proxy):
        # verify the connection is alive before handing it to a task
        try:
            try:
                dbapi_con.ping(False)
            except TypeError:
                app.logger.debug('MySQL connection died. Restoring...')
                dbapi_con.ping()
        except dbapi_con.OperationalError as e:
            app.logger.warning(e)
            if e.args[0] in (2006, 2013, 2014, 2045, 2055):
                # tell the pool to discard this connection and retry
                raise exc.DisconnectionError()
            else:
                raise
    return db
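The other option the answer mentions, disabling the pool, can be sketched with SQLAlchemy's `NullPool`: every checkout opens a brand-new DBAPI connection and every checkin closes it, so a stale pooled connection can never be handed to a task. As above, the in-memory SQLite URL is a stand-in for the real `mysql://` URL:

```python
from sqlalchemy import create_engine, text
from sqlalchemy.pool import NullPool

# NullPool disables pooling entirely: each checkout opens a fresh
# connection and each checkin closes it, so nothing can go stale.
engine = create_engine('sqlite://', poolclass=NullPool)

with engine.connect() as conn:
    print(conn.execute(text('SELECT 1')).scalar())  # → 1
```

The trade-off is paying connection setup cost on every task, which may matter when processing a large CSV line by line.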