Heroku H12 error when downloading a file from a Dash app

Time: 2019-12-02 14:48:48

Tags: heroku timeout celery plotly-dash

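I have a Dash app that collects weather data one month at a time (the most the third party allows per request) and then combines the data so that the user can download it. When I test the app with Heroku Local everything works, but when I deploy it to Heroku I get an H12 error as soon as the download takes longer than 30 seconds. I am using Celery and Redis for background tasks and workers. My understanding is that with a background worker I should be able to exceed the 30-second timeout.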
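To make the timing constraint concrete, here is a minimal sketch of the "start the job, return immediately, poll later" pattern that a background worker is meant to enable. It is not the app's code: `ThreadPoolExecutor` stands in for the Celery worker, and `slow_download`, `start_job`, and `job_status` are hypothetical names. The point is that the request that starts the job only hands back a job id, so no single request has to stay open for the whole download.

```python
import time
import uuid
from concurrent.futures import ThreadPoolExecutor

# Stand-in for the Celery worker (assumption: the real app would call
# tasks.download_remote_data.apply_async and use the Redis result backend).
executor = ThreadPoolExecutor(max_workers=1)
jobs = {}  # job id -> Future


def slow_download(seconds):
    """Hypothetical long-running job; it sleeps instead of fetching data."""
    time.sleep(seconds)
    return "station,year,month\n348,2000,1\n"


def start_job(seconds):
    """Enqueue the job and return an id right away, without waiting for it."""
    job_id = uuid.uuid4().hex
    jobs[job_id] = executor.submit(slow_download, seconds)
    return job_id


def job_status(job_id):
    """Cheap status check that a short follow-up request could call."""
    future = jobs[job_id]
    if not future.done():
        return {"state": "PENDING", "result": None}
    return {"state": "SUCCESS", "result": future.result()}


if __name__ == "__main__":
    job = start_job(0.2)
    print(job_status(job)["state"])   # checked while the job is still sleeping
    jobs[job].result(timeout=5)       # wait here only for the demo
    print(job_status(job)["state"])
```

In a Dash app the follow-up check could plausibly be a second callback driven by a `dcc.Interval` component that asks Celery whether the task is ready (for example through `AsyncResult(task_id).ready()`), but that wiring is an assumption and is not part of the code in this question.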


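  • Downloading the data from the third party takes more than 30 seconds. This cannot be shortened, but it can be done in chunks.
  • Currently I save the data to a tmp folder and the user downloads it from disk. I know that Heroku's filesystem is ephemeral, but I am not sure whether S3 or another storage system makes sense. I do not need to keep the file; it only has to exist long enough to be sent from disk to the user.
  • The Celery task is triggered inside a Dash callback, so there is some coupling between the running task and the web worker.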
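Since the third party serves one month per request, the download splits naturally into one URL per month. The sketch below builds that list with the standard library only and includes both the start month and the end month. `month_starts`, `build_urls`, and the `example.invalid` URL template are hypothetical; the real query string is not reproduced here.

```python
from datetime import date
from typing import Iterator, List

# Hypothetical template; the real bulk-data query string is not shown here.
URL_TEMPLATE = "https://example.invalid/bulk?station={}&year={}&month={}&day={}"


def month_starts(start_year: int, start_month: int,
                 end_year: int, end_month: int) -> Iterator[date]:
    """Yield the first day of every month from start to end, inclusive."""
    year, month = start_year, start_month
    while (year, month) <= (end_year, end_month):
        yield date(year, month, 1)
        month += 1
        if month > 12:
            year, month = year + 1, 1


def build_urls(station_id: str, start_year: int, start_month: int,
               end_year: int, end_month: int) -> List[str]:
    """Return one URL per monthly chunk, in chronological order."""
    return [URL_TEMPLATE.format(station_id, d.year, d.month, 1)
            for d in month_starts(start_year, start_month, end_year, end_month)]


if __name__ == "__main__":
    urls = build_urls("348", 2000, 1, 2010, 1)
    print(len(urls))  # 121 monthly chunks (January 2000 through January 2010)
    print(urls[0])
```

Each chunk is an independent request, so a worker can fetch them one at a time and report progress in between. Note that `pd.date_range(start='2000/1', end='2010/1', freq='M')` behaves differently: it yields month-end dates and stops at December 2009, so the end month itself is not included.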



# tasks.py
import os

import celery
import pandas as pd

celery_app = celery.Celery('download')
celery_app.conf.update(BROKER_URL=os.environ['REDIS_URL'],
                       CELERY_RESULT_BACKEND=os.environ['REDIS_URL'])


@celery_app.task  # register the function so that .apply_async() is available
def download_remote_data(station_id, start_year, start_month, end_year, end_month, url_raw, relative_filename):

    # In this test case the download dates are defined in app.py and not dynamically by the user.
    # Note: freq='M' yields month-end dates, so with end='2010/1' the last
    # date produced is 2009-12-31 and January 2010 itself is not downloaded.
    download_dates = pd.date_range(start=start_year + '/' + start_month,
                                   end=end_year + '/' + end_month, freq='M')

    # One bulk-data URL per month: station id, year, month, day
    urls = [url_raw.format(station_id, date.year, date.month, 1) for date in download_dates]

    # Download every monthly CSV and concatenate them into a single DataFrame
    results = pd.concat((pd.read_csv(url) for url in urls))

    # Store the file in the temporary folder as CSV
    absolute_filename = os.path.join(os.getcwd(), relative_filename)
    results.to_csv(absolute_filename, index=False)

    return results.to_dict()


# app.py
import os

import dash
import flask
from dash.dependencies import Input, Output

import tasks

app = dash.Dash(__name__)
server = app.server  # referenced by the Procfile as app:server
# app.layout, which defines 'generate-btn' and 'download-link', is not shown in the question

# Environment Canada Bulk Data Download Path
# (the rest of the query string, with its four '{}' placeholders, is cut off in the source)
bulk_data_pathname = 'https://climate.weather.gc.ca/climate_data/bulk_data_e.html?'


# Callback to download data
@app.callback(
    Output(component_id='download-link', component_property='href'),
    [Input(component_id='generate-btn', component_property='n_clicks')])
def update_output_div(clicked):
    ctx = dash.callback_context  # Look for specific click event

    if clicked and ctx.triggered[0]['prop_id'] == 'generate-btn.n_clicks':
        # store downloaded csv file in tmp
        relative_filename = os.path.join('tmp', 'downloaded.csv')
        # Use fixed dates for testing, reality is user will set dates
        data = tasks.download_remote_data.apply_async(['348', '2000', '1', '2010', '1', bulk_data_pathname,
                                                       relative_filename])
        link_path = '/{}'.format(relative_filename)
    else:
        link_path = ''
    return link_path


# Flask Magik
@app.server.route('/tmp/<path:path>')
def serve_static(path):
    root_dir = os.getcwd()
    return flask.send_from_directory(
        os.path.join(root_dir, 'tmp'), path
    )


if __name__ == '__main__':
    app.run_server(debug=True, processes=4)


web: gunicorn app:server --timeout 90 -w 4 -k gevent --log-file=-
worker: celery -A tasks worker --loglevel=info
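
One consequence of running separate `web` and `worker` processes: on Heroku each dyno has its own ephemeral filesystem, so a CSV written to `tmp` by the worker is not visible to the web process that serves the download link. A possible workaround, sketched here under assumptions, is to hand the CSV text over through a shared store instead of the disk. In the sketch a plain dict stands in for Redis, and `rows_to_csv`, `store_csv`, and `fetch_csv` are hypothetical helpers.

```python
import csv
import io
from typing import Dict, Iterable, List, Optional

# Stand-in for a shared store such as Redis (assumption: the real app would
# use the Redis instance already configured through REDIS_URL).
shared_store: Dict[str, str] = {}


def rows_to_csv(header: List[str], rows: Iterable[List[object]]) -> str:
    """Serialise rows to CSV text in memory, without touching the disk."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


def store_csv(key: str, csv_text: str) -> None:
    """Worker side: publish the finished file under a key."""
    shared_store[key] = csv_text


def fetch_csv(key: str) -> Optional[str]:
    """Web side: return the file if it is ready, otherwise None."""
    return shared_store.get(key)


if __name__ == "__main__":
    text = rows_to_csv(["station", "year", "month"],
                       [[348, 2000, 1], [348, 2000, 2]])
    store_csv("downloaded.csv", text)
    print(fetch_csv("downloaded.csv"))
```

With a real Redis store the key could be written with an expiry, so the file only lives long enough to be sent to the user, and the Flask route could return the text with `flask.Response(csv_text, mimetype='text/csv')` instead of `send_from_directory`. Both are suggestions rather than tested changes to this app.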

0 answers:
