I've run into a really frustrating error that pops up whenever one of my API endpoints is hit. For context, I'm working on a Flask application using SQLAlchemy that stores data in a PostgreSQL database configured to hold 1000 connections.
One of the ways users can query that data is through the /timeseries endpoint. The data is returned as JSON, assembled from the ResultProxies that come back from querying the database.
I was hoping that by using multithreading I could make the view that handles /timeseries respond faster.
I've read many other posts about this same issue being caused by sessions not being cleaned up properly, but I feel like I have that covered. Is there anything obviously wrong with the code I've written?
The application is deployed using AWS Elastic Beanstalk.
@classmethod
def timeseries_all(cls, table_names, agg_unit, start, end, geom=None):
    """
    For each candidate dataset, query the matching timeseries and push datasets with nonempty
    timeseries into a list to convert to JSON and display.

    :param table_names: list of tables to generate timetables for
    :param agg_unit: a unit of time to divide up the data by (day, week, month, year)
    :param start: starting date to limit query
    :param end: ending date to limit query
    :param geom: geometric constraints of the query
    :returns: timeseries list to display
    """
    threads = []
    timeseries_dicts = []

    # set up engine for use with threading
    psql_db = create_engine(DATABASE_CONN, pool_size=10, max_overflow=-1, pool_timeout=100)
    scoped_sessionmaker = scoped_session(sessionmaker(bind=psql_db, autoflush=True, autocommit=True))

    def fetch_timeseries(t_name):
        _session = scoped_sessionmaker()
        # retrieve MetaTable object to call timeseries from
        table = MetaTable.get_by_dataset_name(t_name)
        # retrieve ResultProxy from executing timeseries selection
        rp = _session.execute(table.timeseries(agg_unit, start, end, geom))
        # empty results will just have a header
        if rp.rowcount > 0:
            timeseries = {
                'dataset_name': t_name,
                'items': [],
                'count': 0
            }
            for row in rp.fetchall():
                timeseries['items'].append({'count': row.count, 'datetime': row.time_bucket.date()})
                timeseries['count'] += row.count
            # load to outer storage
            timeseries_dicts.append(timeseries)
        # clean up session
        rp.close()
        scoped_sessionmaker.remove()

    # create a new thread for every table to query
    for name in table_names:
        thread = threading.Thread(target=fetch_timeseries, args=(name,))
        threads.append(thread)
    # start all threads
    for thread in threads:
        thread.start()
    # wait for all threads to finish
    for thread in threads:
        thread.join()

    # release all connections associated with this engine
    psql_db.dispose()
    return timeseries_dicts
Answer 0 (score: 3)
I think you're going about this in a roundabout way. Here are some suggestions on getting the most out of your postgres connections (I've used this configuration in production).
A more efficient way to handle lots of requests is to put your Flask application behind a WSGI server like gunicorn or uwsgi. These servers are able to spawn multiple instances of your application. Then, when someone hits your endpoint, the connections are load-balanced across those instances.
So, for example, if you have uwsgi set up to run 5 processes, you'd be able to handle 50 db connections at once (5 apps x 10 pool size).
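As a minimal sketch of that setup with gunicorn (the module path `myapp:app` is a placeholder for wherever your Flask app object actually lives, and this assumes the SQLAlchemy engine is created once at module level with `pool_size=10` rather than inside the view):

```shell
# Run 5 worker processes; each process gets its own SQLAlchemy engine
# with pool_size=10, so the app can use up to 5 x 10 = 50 db connections.
# "myapp:app" is a hypothetical module:app path -- adjust to your project.
gunicorn --workers 5 --bind 0.0.0.0:8000 myapp:app
```

The uwsgi equivalent would be `processes = 5` in your uwsgi.ini. Either way, the key point is that the engine and its pool belong to the process, not to a single request, so you stop creating and disposing an engine on every call to /timeseries.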