Simple SQLAlchemy query in Flask executes very slowly

Time: 2018-05-03 06:47:31

Tags: python database postgresql flask sqlalchemy

I have a table like this:

                             Table "public.transactions"
       Column        |           Type           | Nullable | Default | Storage
---------------------+--------------------------+----------+---------+----------
 id                  | integer                  | not null | nextval | plain
 ticket              | integer                  |          |         | plain
 pay_station         | character varying(50)    |          |         | extended
 stall               | character varying(50)    |          |         | extended
 license_plate       | character varying(8)     |          |         | extended
 purchased_date      | timestamp with time zone | not null |         | plain
 expiry_date         | timestamp with time zone |          |         | plain
 payment_type        | character varying(50)    |          |         | extended
 total_collections   | numeric(10,2)            |          |         | main
 revenue             | numeric(10,2)            |          |         | main
 rate_name           | character varying(50)    |          |         | extended
 hours_paid          | numeric(4,2)             |          |         | main
 validation_revenue  | numeric(10,2)            |          |         | main
 transaction_fee     | numeric(10,2)            |          |         | main
 method              | character varying(50)    |          |         | extended
Indexes:
    "transactions_pkey" PRIMARY KEY, btree (id)
    "transactions_expiry_date_idx" btree (expiry_date)
    "transactions_purchased_date_idx" btree (purchased_date)
    "transactions_stall_idx" btree (stall)

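For brevity, I have omitted 20 or so columns.

This table has about 2.5 million rows.

My Python code, which serves an API in Flask, looks like this for a sample query: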

filters = [
    datetime_range['start'] < Transactions.expiry_date,
    datetime_range['end'] > Transactions.purchased_date
]

if 'parking_spaces' in params:
    spaces = params['parking_spaces'] # array
    filters.append(Transactions.stall.in_(spaces))

results = Transactions.query.with_entities(
    Transactions.stall, Transactions.purchased_date, Transactions.expiry_date
    ).filter(*filters).order_by(Transactions.purchased_date).all()

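The dates are datetime objects. If I provide no input in the POST body, I default to the min/max times with no WHERE ... IN filter on stalls, and the query looks like this: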

datetime_range: 
{'start': datetime.datetime(2016, 1, 1, 0, 0, tzinfo=datetime.timezone.utc), 'end': datetime.datetime(2019, 1, 1, 0, 0, tzinfo=datetime.timezone.utc)}

Query: 
SELECT transactions.stall AS transactions_stall, transactions.purchased_date AS transactions_purchased_date, transactions.expiry_date AS transactions_expiry_date 
FROM transactions 
WHERE transactions.expiry_date > %(expiry_date_1)s AND transactions.purchased_date < %(purchased_date_1)s ORDER BY transactions.purchased_date
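For context, here is a minimal sketch of how the POST body could be turned into the datetime_range dict shown above, falling back to the default min/max when nothing is supplied. The names parse_datetime_range, DEFAULT_START, DEFAULT_END and ISO_FORMAT are hypothetical and only for illustration; the dictionary keys, the default values and the timestamp format are taken from the post.

```python
from datetime import datetime, timezone

# Defaults taken from the datetime_range printed above.
DEFAULT_START = datetime(2016, 1, 1, tzinfo=timezone.utc)
DEFAULT_END = datetime(2019, 1, 1, tzinfo=timezone.utc)

# Matches timestamps such as "2017-01-01T14:30:00.000Z".
ISO_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"


def parse_datetime_range(body):
    """Return {'start': ..., 'end': ...} as timezone-aware UTC datetimes."""
    raw = (body or {}).get("datetime_range", {})

    def parse(key, default):
        value = raw.get(key)
        if value is None:
            return default
        # The trailing "Z" means UTC, so attach the UTC timezone explicitly.
        return datetime.strptime(value, ISO_FORMAT).replace(tzinfo=timezone.utc)

    return {
        "start": parse("start", DEFAULT_START),
        "end": parse("end", DEFAULT_END),
    }
```

Calling parse_datetime_range({}) yields the default 2016-01-01 to 2019-01-01 range used in the query above.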

Now, if I execute the query directly in psql:

SELECT transactions.stall, transactions.purchased_date, transactions.expiry_date FROM transactions WHERE transactions.expiry_date > '2016-01-01 00:00:00.000Z' AND transactions.purchased_date < '2019-01-01 00:00:00.000Z';

Time: 2724.326 ms (00:02.724)

However, when I execute the same query through SQLAlchemy in Flask and test it with Postman, I get a response time of 866946 ms (14:26.946), returning 13.4KB of data.
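The Postman figure covers the whole request, not only the database round trip. One possible way to see which step takes the time is to measure the query and the response building separately. The following is only a sketch using the standard library: the helper timed and the function handle_request are hypothetical, and run_query and serialize are stand-ins for the .all() call and whatever code builds the JSON response.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, timings):
    """Record the wall-clock seconds spent inside the with-block under `label`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[label] = time.perf_counter() - start


def handle_request(run_query, serialize):
    """Time the query step and the serialization step separately.

    `run_query` and `serialize` are assumed stand-ins for the `.all()` call
    and the code that turns rows into the response payload.
    """
    timings = {}
    with timed("query", timings):
        rows = run_query()
    with timed("serialize", timings):
        payload = serialize(rows)
    return payload, timings


if __name__ == "__main__":
    # Dummy callables so the sketch runs without a database.
    payload, timings = handle_request(
        run_query=lambda: [("A1", 1), ("B2", 2)],
        serialize=lambda rows: [{"stall": s, "n": n} for s, n in rows],
    )
    print(timings)
```

In the real view, run_query would wrap the Transactions.query ... .all() call from the code above, so the two timings show how much of the response time is spent outside the query itself.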

This is obviously a huge difference. As I adjust the range, the response time grows steeply with the size of the range. Some samples:

{
    "datetime_range": {
        "start": "2017-01-01T14:30:00.000Z",
        "end": "2017-01-04T18:00:00.000Z"
    }
}

Response time: 13387 ms (00:13.39)

The same query in psql:

SELECT transactions.stall, transactions.purchased_date, transactions.expiry_date FROM transactions WHERE transactions.expiry_date > '2017-01-01T14:30:00.000Z' AND transactions.purchased_date < '2017-01-04T18:00:00.000Z';

Time: 580.603 ms

{
    "datetime_range": {
        "start": "2017-01-01T14:30:00.000Z",
        "end": "2017-03-01T18:00:00.000Z"
    }
}

SQLAlchemy response time: 41878 ms

SELECT transactions.stall, transactions.purchased_date, transactions.expiry_date FROM transactions WHERE transactions.expiry_date > '2017-01-01T14:30:00.000Z' AND transactions.purchased_date < '2017-03-01T18:00:00.000Z';

PostgreSQL response time: 1170.169 ms (00:01.170)
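To put the three measurements above side by side, the short calculation below divides each Flask/SQLAlchemy response time by the corresponding psql time. The numbers are exactly those reported above; the variable and function names are only illustrative.

```python
# (date range, Flask/SQLAlchemy response time in ms, psql time in ms), as reported above.
samples = [
    ("2016-01-01 to 2019-01-01", 866946, 2724.326),
    ("2017-01-01 to 2017-01-04", 13387, 580.603),
    ("2017-01-01 to 2017-03-01", 41878, 1170.169),
]


def slowdown(flask_ms, psql_ms):
    """How many times slower the Flask response is than the psql query."""
    return flask_ms / psql_ms


ratios = {label: slowdown(flask_ms, psql_ms) for label, flask_ms, psql_ms in samples}

for label, ratio in ratios.items():
    print(f"{label}: {ratio:.1f}x slower through Flask")
```

The Flask responses come out roughly 318, 23 and 36 times slower than psql for the three ranges, so the gap is largest for the widest range.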

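Why is there such a huge difference here, and how can I make SQLAlchemy execute faster so that my Flask response times stay on the order of seconds rather than minutes?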

0 answers:

No answers