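Why does Postgres seemingly randomly decide whether to use an index for the same query depending on the number of parameters in the IN clause?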

Time: 2019-07-01 14:42:51

Tags: sql postgresql indexing

I have the following table, which is partitioned by date range:

create table volume (
   id bigint,
   event_ts timestamptz,
   lane integer,
   measure_line integer,
   volume integer
) partition by range (event_ts);

I wrote a query that sums the volume for the given lanes on a measure_line in 15-minute intervals:

   with time_interval as (
      SELECT start_time, start_time + interval '15 minute' as end_time
      FROM generate_series( to_timestamp(1561496400), to_timestamp(1561582799), '15 minute'::interval) as start_time
   )
   select
     extract('epoch' from f.start_time)::int as start_time,
     coalesce(sum(volume), 0) as volume
   from volume v
   right join time_interval f
   on v.event_ts > f.start_time and v.event_ts < f.end_time
   and v.measure_line = 1
   and lane in (1,2,3,4)
   group by f.start_time, f.end_time
   order by start_time;

There are multiple partitions, one per day:

...
volume_20190509
volume_20190510
volume_20190511
...

The following composite index is created on each partition:

volume_20190518_event_ts_lane_measure_line_idx
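
A minimal sketch of how such an index could be defined on one partition. The column order (event_ts, lane, measure_line) is an assumption inferred from the index name; the exact definition is not shown here.

-- Assumed definition; the column order is inferred from the index name.
-- PostgreSQL 10 cannot create an index on the partitioned parent table,
-- so a statement like this has to be run once per partition.
create index volume_20190518_event_ts_lane_measure_line_idx
    on volume_20190518 (event_ts, lane, measure_line);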

The query returns the following result:

start_time  volume_count
1561496400  58
1561497300  47
1561498200  43
1561499100  49
1561500000  39
1561500900  41
1561501800  47
1561502700  28
1561503600  41

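Depending on the number of parameters in the IN clause on the lane column, Postgres either uses a sequential scan or the index I created (I checked this with EXPLAIN). When it decides to use a sequential scan, the query is at least 10 times slower! I cannot make sense of this behavior. There is no obvious reason why it should not always use the index. Do you know why?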

Postgres version: PostgreSQL 10.4

For example, when I pass only one value, lane in (1), a sequential scan is used. When I pass more values, lane in (1,2,3,4), the index is used.
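
For concreteness, the slow case can be reproduced by running the same query with only the lane predicate changed, wrapped in EXPLAIN (ANALYZE, BUFFERS). This is a sketch of the single-value variant; nothing else in the query differs from the version given earlier.

   -- Single-value variant: only the lane predicate differs from the original query.
   explain (analyze, buffers)
   with time_interval as (
      SELECT start_time, start_time + interval '15 minute' as end_time
      FROM generate_series( to_timestamp(1561496400), to_timestamp(1561582799), '15 minute'::interval) as start_time
   )
   select
     extract('epoch' from f.start_time)::int as start_time,
     coalesce(sum(volume), 0) as volume
   from volume v
   right join time_interval f
   on v.event_ts > f.start_time and v.event_ts < f.end_time
   and v.measure_line = 1
   and lane in (1)   -- with (1,2,3,4) here, the index scans are chosen instead
   group by f.start_time, f.end_time
   order by start_time;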

Edit:

EXPLAIN plan output

lane in (1,2,3,4), explain (analyze, buffers):

  ->  HashAggregate  (cost=28395929.50..28395932.50 rows=200 width=28) (actual time=607.714..608.083 rows=96 loops=1)
        Group Key: f.start_time, f.end_time
        Buffers: shared hit=31658
        ->  Nested Loop Left Join  (cost=0.42..26014632.00 rows=317506333 width=20) (actual time=1.607..514.124 rows=28630 loops=1)
              Buffers: shared hit=31658
              ->  CTE Scan on time_interval f  (cost=0.00..20.00 rows=1000 width=16) (actual time=0.057..0.977 rows=96 loops=1)
              ->  Append  (cost=0.42..22839.33 rows=317528 width=12) (actual time=1.068..3.957 rows=298 loops=96)
                    Buffers: shared hit=31658
                    ->  Index Scan using volume_20190324_event_ts_lane_measure_line_idx on volume_20190324 v  (cost=0.42..229.75 rows=3222 width=12) (actual time=0.008..0.008 rows=0 loops=96)
                          Index Cond: ((event_ts > f.start_time) AND (event_ts < f.end_time) AND (measure_line = 1))
                          Filter: (lane = ANY ('{1,2,3,4}'::integer[]))
                          Buffers: shared hit=288
                    ->  Index Scan using volume_20190325_event_ts_lane_measure_line_idx on volume_20190325 v_1  (cost=0.42..228.74 rows=3162 width=12) (actual time=0.008..0.008 rows=0 loops=96)
                          Index Cond: ((event_ts > f.start_time) AND (event_ts < f.end_time) AND (measure_line = 1))
                          Filter: (lane = ANY ('{1,2,3,4}'::integer[]))
                          Buffers: shared hit=288
                    ->  Index Scan using volume_20190326_event_ts_lane_measure_line_idx on volume_20190326 v_2  (cost=0.42..226.53 rows=3122 width=12) (actual time=0.007..0.007 rows=0 loops=96)
                          Index Cond: ((event_ts > f.start_time) AND (event_ts < f.end_time) AND (measure_line = 1))
                          Filter: (lane = ANY ('{1,2,3,4}'::integer[]))
                          Buffers: shared hit=288
                    ->  Index Scan using volume_20190327_event_ts_lane_measure_line_idx on volume_20190327 v_3  (cost=0.42..229.48 rows=3157 width=12) (actual time=0.016..0.016 rows=0 loops=96)
                          Index Cond: ((event_ts > f.start_time) AND (event_ts < f.end_time) AND (measure_line = 1))
                          Filter: (lane = ANY ('{1,2,3,4}'::integer[]))
                          Buffers: shared hit=288
                    ->  Index Scan using volume_20190328_event_ts_lane_measure_line_idx on volume_20190328 v_4  (cost=0.42..228.97 rows=3157 width=12) (actual time=0.007..0.007 rows=0 loops=96)
                          Index Cond: ((event_ts > f.start_time) AND (event_ts < f.end_time) AND (measure_line = 1))
                          Filter: (lane = ANY ('{1,2,3,4}'::integer[]))
                          Buffers: shared hit=288
                    ->  Index Scan using volume_20190329_event_ts_lane_measure_line_idx on volume_20190329 v_5  (cost=0.42..230.02 rows=3227 width=12) (actual time=0.007..0.007 rows=0 loops=96)
                          Index Cond: ((event_ts > f.start_time) AND (event_ts < f.end_time) AND (measure_line = 1))
                          Filter: (lane = ANY ('{1,2,3,4}'::integer[]))
                          Buffers: shared hit=288
...

lane in (1), takes 5 minutes to complete:

Sort  (cost=16986442.13..16986442.63 rows=200 width=28) (actual time=332823.872..332824.048 rows=96 loops=1)
  Sort Key: f.start_time
  Sort Method: quicksort  Memory: 32kB
  Buffers: shared hit=24 read=113392, temp read=215935 written=2272
  CTE time_interval
    ->  Function Scan on generate_series start_time  (cost=0.00..12.50 rows=1000 width=16) (actual time=0.045..0.482 rows=96 loops=1)
  ->  HashAggregate  (cost=16986418.98..16986421.98 rows=200 width=28) (actual time=332823.319..332823.630 rows=96 loops=1)
        Group Key: f.start_time, f.end_time
        Buffers: shared hit=24 read=113392, temp read=215935 written=2272
        ->  Nested Loop Left Join  (cost=0.00..16386221.49 rows=80026333 width=20) (actual time=12066.893..332801.773 rows=7190 loops=1)
              Join Filter: ((v.event_ts > f.start_time) AND (v.event_ts < f.end_time))
              Rows Removed by Join Filter: 68689930
              Buffers: shared hit=24 read=113392, temp read=215935 written=2272
              ->  CTE Scan on time_interval f  (cost=0.00..20.00 rows=1000 width=16) (actual time=0.058..1.258 rows=96 loops=1)
              ->  Materialize  (cost=0.00..270371.58 rows=720237 width=12) (actual time=0.012..1865.908 rows=715595 loops=96)
                    Buffers: shared hit=24 read=113392, temp read=215935 written=2272
                    ->  Append  (cost=0.00..263253.39 rows=720237 width=12) (actual time=0.115..7524.225 rows=715595 loops=1)
                          Buffers: shared hit=24 read=113392
                          ->  Seq Scan on volume_20190324 v  (cost=0.00..2649.35 rows=7263 width=12) (actual time=0.107..83.392 rows=7205 loops=1)
                                Filter: ((measure_line = 1) AND (lane = 1))
                                Rows Removed by Filter: 93285
                                Buffers: shared hit=1 read=1141
                          ->  Seq Scan on volume_20190325 v_1  (cost=0.00..2644.50 rows=7276 width=12) (actual time=0.047..115.813 rows=7190 loops=1)
                                Filter: ((measure_line = 1) AND (lane = 1))
                                Rows Removed by Filter: 93110
                                Buffers: shared hit=1 read=1139
                          ->  Seq Scan on volume_20190326 v_2  (cost=0.00..2618.95 rows=7002 width=12) (actual time=0.049..77.066 rows=7105 loops=1)
                                Filter: ((measure_line = 1) AND (lane = 1))
                                Rows Removed by Filter: 92225
                                Buffers: shared hit=1 read=1128
                          ->  Seq Scan on volume_20190327 v_3  (cost=0.00..2644.05 rows=7190 width=12) (actual time=0.041..62.538 rows=7195 loops=1)
                                Filter: ((measure_line = 1) AND (lane = 1))
                                Rows Removed by Filter: 93075
                                Buffers: shared hit=1 read=1139
                          ->  Seq Scan on volume_20190328 v_4  (cost=0.00..2644.05 rows=7127 width=12) (actual time=0.026..44.041 rows=7200 loops=1)
                                Filter: ((measure_line = 1) AND (lane = 1))
                                Rows Removed by Filter: 93070
                                Buffers: shared hit=1 read=1139
                          ->  Seq Scan on volume_20190329 v_5  (cost=0.00..2643.90 rows=7236 width=12) (actual time=0.039..44.562 rows=7205 loops=1)
...

0 answers:

No answers yet