High memory usage with Flask, SQLAlchemy, and streaming responses

Time: 2015-08-10 21:18:27

Tags: python flask sqlalchemy

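I'm writing a page that downloads millions of records from my database. I plan to run it in an environment with limited memory, so I want to stream the CSV data. For some reason this code still uses a lot of memory, and the memory is not released after the download completes. What is causing this leak? My application goes from using 30 MB of memory to 2 GB.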

@app.route('/admin/querydb', methods=['GET', 'POST'])
@admin_filter
def admin_query_db():

    if request.method == 'POST':
        query = model.DriverStop.query.join(model.DriverDailyRoute, model.Agency).join(model.User)

        if 'date_filter_start' in request.form:
            start = datetime.datetime.strptime(request.form['start_date'], '%Y-%m-%d')
            start -= datetime.timedelta(days=1)
            query = query.filter(model.DriverDailyRoute.date >= start)

        if 'date_filter_end' in request.form:
            end = datetime.datetime.strptime(request.form['end_date'], '%Y-%m-%d')
            query = query.filter(model.DriverDailyRoute.date < end)

        if 'recipient' not in request.form:
            query = query.filter(model.Agency.agency_type != model.Agency.RECIPIENT)

        if 'donor' not in request.form:
            query = query.filter(model.Agency.agency_type != model.Agency.DONOR)


        header = ['Username', 'Last Name', 'First Name', 'Agency Name', 
                  'Agency Type', 'City', 'Date', 'Time', 'Is Special Stop', 'Cargo Temperature',
                  'Prepared', 'Produce', 'Dairy', 'Raw Meat', 'Perishable', 'Dry Goods',
                  'Bread', 'Total']

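        def csv_line(items):
            # Quote every field, double any embedded quotes, and join with commas.
            # (Slicing the list with [:-1] dropped the last column, not the trailing comma.)
            return ','.join('"' + str(s).replace('"', '""') + '"' for s in items)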

        def gen_csv():
            yield csv_line(header) + '\n'
            for q in query.all():
                yield csv_line([q.route.driver.username, q.route.driver.last_name, q.route.driver.first_name,
                         q.agency.name, q.agency.agency_type, q.agency.city, q.route.date, 
                         q.time, q.special_stop, q.cargo_temp, q.prepared, q.produce,
                         q.dairy, q.raw_meat, q.perishable, q.dry_goods, q.bread, q.total_up()]) + '\n'

        return Response(gen_csv(), mimetype='text/csv')

    drivers = model.User.query.filter_by(acct_type=model.User.DRIVER).order_by(model.User.active, model.User.last_name, model.User.first_name, model.User.username).all()
    agencies = model.Agency.query.order_by(model.Agency.active, model.Agency.name).all()
    return render_template('admin/dbquery.html', page_title='Database Query', drivers=drivers, agencies=agencies)

Some of my other pages behave the same way: they don't release memory after a large query.

1 answer:

Answer 0 (score: 3)

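Calling Query.all() makes SQLAlchemy load every result of the database query and convert them into a list held in memory. You should use Query.yield_per() to load the data in batches instead; in gen_csv(), that means iterating over query.yield_per(n) for some batch size n rather than over query.all().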
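
To illustrate, here is a minimal sketch of a CSV generator that iterates with Query.yield_per(). It assumes SQLAlchemy 1.4 or newer and an in-memory SQLite database. The `Stop` model, its columns, and the batch size are hypothetical stand-ins for the question's models, not the asker's actual schema.

```python
import csv
import io

from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Stop(Base):
    """Hypothetical stand-in for the question's DriverStop model."""
    __tablename__ = "stops"
    id = Column(Integer, primary_key=True)
    agency_name = Column(String)
    total = Column(Integer)


def csv_line(items):
    """Format one CSV row with the standard library, quoting every field."""
    buf = io.StringIO()
    csv.writer(buf, quoting=csv.QUOTE_ALL, lineterminator="\n").writerow(items)
    return buf.getvalue()


def gen_csv(query, batch_size=1000):
    """Yield CSV lines, fetching rows in batches rather than all at once."""
    yield csv_line(["Agency Name", "Total"])
    # yield_per() hands back rows batch_size at a time instead of
    # building one list of every result the way all() does.
    for stop in query.yield_per(batch_size):
        yield csv_line([stop.agency_name, stop.total])


# Demo data in an in-memory SQLite database (an assumption for this sketch).
engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
with Session(engine) as session:
    session.add_all(
        [Stop(agency_name='Food "Bank" %d' % i, total=i) for i in range(5)]
    )
    session.commit()
    # list() is used here only to inspect the output; in the Flask view the
    # generator itself would be passed to Response(), as in the question.
    lines = list(gen_csv(session.query(Stop).order_by(Stop.id), batch_size=2))
```

In the view from the question, the equivalent change would be replacing `query.all()` with `query.yield_per(...)` inside `gen_csv()`. Whether that is enough on its own may also depend on the database driver, since some drivers buffer the whole result set on the client unless they are told to stream.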