如何使这个Postgres查询更快?

时间:2019-03-20 01:42:17

标签: sql postgresql

我使用PostgreSQL。我做一个表聚合计算。 我使用核心四块。我的数据格式是这样的。

----------------------------------------------
line_no |holiday_flag| passenger_up|plan_date
----------------------------------------------
1           0           22          2019-02-02
1           0           15          2019-02-02
1           0           14          2019-02-02
1           1           0           2019-02-03
1           1           11          2019-02-03
1           1           3           2019-02-03
1           1           15          2019-02-03
1           3           4           2019-02-17
1           3           1           2019-02-17
1           3           3           2019-02-17
1           3           4           2019-02-17
1           3           2           2019-02-17
1           3           0           2019-02-17
1           3           1           2019-02-17
1           2           0           2019-02-02
1           2           0           2019-02-02
1           2           0           2019-02-02
1           2           3           2019-02-02

我想用holiday_flag来对passenger_up求和。holiday_flag的类型为0、1、2、3。holiday_flag(0,2)聚集在一组中。最终的holiday_flag类型为0、1、3。 我的最终结果是这样的。

-----------------------------------
line_no |holiday_flag| passenger_up
-----------------------------------
1           0           4858
1           1           2176
1           3           4638
-----------------------------------

我的sql现在是这样的。

SELECT
    line_no,
    holiday_flag,
    round(passenger_up / day_count) as passenger_up
FROM
    (
        SELECT
            line_no,
            holiday_flag,
            SUM (passenger_up) AS passenger_up,
            SUM (day_count) AS day_count
        FROM
            (
                SELECT
                    line_no,
                    CASE
                WHEN holiday_flag = 0
                OR holiday_flag = 2 THEN
                    0
                ELSE
                    holiday_flag
                END AS holiday_flag,
                SUM (passenger_up) AS passenger_up,
                COUNT (DISTINCT plan_date) AS day_count
            FROM
                ext_passenger_flow_tag
            WHERE
                line_no = 1
            AND plan_date >= '2019-01-31'
            AND plan_date <= '2019-02-20'
            GROUP BY
                line_no,
                holiday_flag
            ) AS B
        GROUP BY
            B.line_no,
            B.holiday_flag
    ) AS C

我的EXPLAIN输出是这样的。

Subquery Scan on c  (cost=4456.55..4456.72 rows=4 width=40) (actual time=19.316..19.322 rows=3 loops=1)
  Output: c.line_no, c.holiday_flag, round((c.passenger_up / c.day_count), 0)
  Buffers: shared hit=372
  ->  GroupAggregate  (cost=4456.55..4456.66 rows=4 width=72) (actual time=19.312..19.316 rows=3 loops=1)
        Output: b.line_no, b.holiday_flag, sum(b.passenger_up), sum(b.day_count)
        Group Key: b.line_no, b.holiday_flag
        Buffers: shared hit=372
        ->  Sort  (cost=4456.55..4456.56 rows=4 width=24) (actual time=19.305..19.305 rows=4 loops=1)
              Output: b.line_no, b.holiday_flag, b.passenger_up, b.day_count
              Sort Key: b.line_no, b.holiday_flag
              Sort Method: quicksort  Memory: 25kB
              Buffers: shared hit=372
              ->  Subquery Scan on b  (cost=4359.02..4456.51 rows=4 width=24) (actual time=17.501..19.279 rows=4 loops=1)
                    Output: b.line_no, b.holiday_flag, b.passenger_up, b.day_count
                    Buffers: shared hit=372
                    ->  GroupAggregate  (cost=4359.02..4456.47 rows=4 width=26) (actual time=17.500..19.276 rows=4 loops=1)
                          Output: ext_passenger_flow_tag.line_no, CASE WHEN ((ext_passenger_flow_tag.holiday_flag = 0) OR (ext_passenger_flow_tag.holiday_flag = 2)) THEN 0 ELSE (ext_passenger_flow_tag.holiday_flag)::integer END, sum(ext_passenger_flow_tag.passenger_up), count(DISTINCT ext_passenger_flow_tag.plan_date), ext_passenger_flow_tag.holiday_flag
                          Group Key: ext_passenger_flow_tag.line_no, ext_passenger_flow_tag.holiday_flag
                          Buffers: shared hit=372
                          ->  Sort  (cost=4359.02..4378.50 rows=7790 width=14) (actual time=15.231..16.075 rows=7899 loops=1)
                                Output: ext_passenger_flow_tag.line_no, ext_passenger_flow_tag.holiday_flag, ext_passenger_flow_tag.passenger_up, ext_passenger_flow_tag.plan_date
                                Sort Key: ext_passenger_flow_tag.holiday_flag
                                Sort Method: quicksort  Memory: 563kB
                                Buffers: shared hit=372
                                ->  Bitmap Heap Scan on public.ext_passenger_flow_tag  (cost=2174.75..3855.50 rows=7790 width=14) (actual time=9.908..12.697 rows=7899 loops=1)
                                      Output: ext_passenger_flow_tag.line_no, ext_passenger_flow_tag.holiday_flag, ext_passenger_flow_tag.passenger_up, ext_passenger_flow_tag.plan_date
                                      Recheck Cond: ((ext_passenger_flow_tag.line_no = 1) AND (ext_passenger_flow_tag.line_subno = 1) AND (ext_passenger_flow_tag.updown = 1) AND (ext_passenger_flow_tag.plan_date >= '2019-01-31'::date) AND (ext_passenger_flow_tag.plan_date <= '2019-02-20'::date) AND (ext_passenger_flow_tag.interval_flag = 2))
                                      Heap Blocks: exact=86
                                      Buffers: shared hit=372
                                      ->  Bitmap Index Scan on ext_pf_tag_index1  (cost=0.00..2172.80 rows=7790 width=0) (actual time=9.876..9.876 rows=7899 loops=1)
                                            Index Cond: ((ext_passenger_flow_tag.line_no = 1) AND (ext_passenger_flow_tag.line_subno = 1) AND (ext_passenger_flow_tag.updown = 1) AND (ext_passenger_flow_tag.plan_date >= '2019-01-31'::date) AND (ext_passenger_flow_tag.plan_date <= '2019-02-20'::date) AND (ext_passenger_flow_tag.interval_flag = 2))
                                            Buffers: shared hit=286
Planning time: 0.533 ms
Execution time: 19.450 ms

我不知道如何优化sql。 我的目标是查询sql运行不到100毫秒。

1 个答案:

答案 0 :(得分:0)

查询所用时间不到100毫秒。

首先,简化您的SQL语句:

SELECT line_no,                                                             
       CASE WHEN holiday_flag = 2                                           
            THEN 0                                                             
            ELSE holiday_flag                                                  
       END AS holiday_flag,                                                    
       round(CAST(sum(passenger_up)                                            
                  AS double precision) /                                       
             CAST(count(DISTINCT plan_date)                                    
                  AS double precision)) AS passenger_up                        
FROM ext_passenger_flow_tag                                                    
WHERE line_no = 1                                                              
  AND plan_date BETWEEN DATE '2019-01-31' AND DATE '2019-02-20'                
GROUP BY line_no, holiday_flag;

我注意到执行计划是针对另一个查询的,您能解决这个问题吗?

在不知道实际执行计划的情况下很难提供进一步的建议。

我还想知道索引的定义,看看它是否是最佳的。