I have the following table, which is partitioned by date range:
create table volume (
    id bigint,
    event_ts timestamptz,
    lane integer,
    measure_line integer,
    volume integer
) partition by range (event_ts);
I created a query that sums the volume for the given lanes on a measure_line in 15-minute intervals:
with time_interval as (
    select start_time, start_time + interval '15 minute' as end_time
    from generate_series(to_timestamp(1561496400), to_timestamp(1561582799), '15 minute'::interval) as start_time
)
select
    extract('epoch' from f.start_time)::int as start_time,
    coalesce(sum(volume), 0) as volume
from volume v
right join time_interval f
    on v.event_ts > f.start_time and v.event_ts < f.end_time
    and v.measure_line = 1
    and lane in (1,2,3,4)
group by f.start_time, f.end_time
order by start_time;
There are multiple partitions, one per day:
...
volume_20190509
volume_20190510
volume_20190511
...
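The partition DDL is not shown here. As a minimal sketch only, a daily partition of this table could be declared roughly as follows; the exact bounds and the UTC offset are assumptions, not taken from the actual schema:

```sql
-- Assumption: each partition covers one calendar day, lower bound inclusive,
-- upper bound exclusive. The '+00' offset is a guess; the real bounds may
-- follow the server's local time zone instead.
CREATE TABLE volume_20190509 PARTITION OF volume
    FOR VALUES FROM ('2019-05-09 00:00:00+00') TO ('2019-05-10 00:00:00+00');
```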
The following composite index is created on each partition:
volume_20190518_event_ts_lane_measure_line_idx
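Only the index name is given above. PostgreSQL's default index names list the indexed columns in order, so the definition was presumably something like the sketch below; the column order is inferred from the name and not confirmed:

```sql
-- Assumption: column order (event_ts, lane, measure_line) is read off the
-- auto-generated index name. In PostgreSQL 10 an index cannot be created on
-- the partitioned parent table, so it has to be created on each partition.
CREATE INDEX ON volume_20190518 (event_ts, lane, measure_line);
```

If that order is right, note that `event_ts` is the leading column, while the filters on `lane` and `measure_line` apply to the trailing ones.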
The query result looks like this:
start_time    volume_count
1561496400    58
1561497300    47
1561498200    43
1561499100    49
1561500000    39
1561500900    41
1561501800    47
1561502700    28
1561503600    41
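Depending on the number of values in the IN clause on the lane column, Postgres uses either a sequential scan or the index I created (I checked this with EXPLAIN). When it decides to use a sequential scan, the query is at least 10 times slower! I cannot understand this behavior. There is no obvious reason why it should not always use the index. Do you know why?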
Postgres version: PostgreSQL 10.4
For example, when I pass only one value, lane in (1), a sequential scan is used. When I pass more values, lane in (1,2,3,4), the index is used.
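A rough sketch of how the two cases can be compared, assuming the same query as above with only the IN list changed. The `SET LOCAL enable_seqscan = off` step is an added diagnostic, not part of the original investigation: it discourages sequential scans for the current transaction only, which shows whether the planner is able to use the index for lane in (1) and what cost it estimates for that plan.

```sql
BEGIN;

-- Diagnostic only: makes sequential scans look very expensive to the planner
-- for this transaction. It is not a fix and should not stay enabled.
SET LOCAL enable_seqscan = off;

EXPLAIN (ANALYZE, BUFFERS)
WITH time_interval AS (
    SELECT start_time, start_time + interval '15 minute' AS end_time
    FROM generate_series(to_timestamp(1561496400), to_timestamp(1561582799), '15 minute'::interval) AS start_time
)
SELECT extract('epoch' FROM f.start_time)::int AS start_time,
       coalesce(sum(v.volume), 0) AS volume
FROM volume v
RIGHT JOIN time_interval f
       ON v.event_ts > f.start_time AND v.event_ts < f.end_time
      AND v.measure_line = 1
      AND v.lane IN (1)   -- the slow single-value case
GROUP BY f.start_time, f.end_time
ORDER BY start_time;

ROLLBACK;
```

Comparing the estimated cost of this forced plan with the sequential-scan plan shows which one the planner believes is cheaper, and by how much.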
Edit:
Explain plan output
lane in (1,2,3,4), explain (analyze, buffers):
  ->  HashAggregate  (cost=28395929.50..28395932.50 rows=200 width=28) (actual time=607.714..608.083 rows=96 loops=1)
        Group Key: f.start_time, f.end_time
        Buffers: shared hit=31658
        ->  Nested Loop Left Join  (cost=0.42..26014632.00 rows=317506333 width=20) (actual time=1.607..514.124 rows=28630 loops=1)
              Buffers: shared hit=31658
              ->  CTE Scan on time_interval f  (cost=0.00..20.00 rows=1000 width=16) (actual time=0.057..0.977 rows=96 loops=1)
              ->  Append  (cost=0.42..22839.33 rows=317528 width=12) (actual time=1.068..3.957 rows=298 loops=96)
                    Buffers: shared hit=31658
                    ->  Index Scan using volume_20190324_event_ts_lane_measure_line_idx on volume_20190324 v  (cost=0.42..229.75 rows=3222 width=12) (actual time=0.008..0.008 rows=0 loops=96)
                          Index Cond: ((event_ts > f.start_time) AND (event_ts < f.end_time) AND (measure_line = 1))
                          Filter: (lane = ANY ('{1,2,3,4}'::integer[]))
                          Buffers: shared hit=288
                    ->  Index Scan using volume_20190325_event_ts_lane_measure_line_idx on volume_20190325 v_1  (cost=0.42..228.74 rows=3162 width=12) (actual time=0.008..0.008 rows=0 loops=96)
                          Index Cond: ((event_ts > f.start_time) AND (event_ts < f.end_time) AND (measure_line = 1))
                          Filter: (lane = ANY ('{1,2,3,4}'::integer[]))
                          Buffers: shared hit=288
                    ->  Index Scan using volume_20190326_event_ts_lane_measure_line_idx on volume_20190326 v_2  (cost=0.42..226.53 rows=3122 width=12) (actual time=0.007..0.007 rows=0 loops=96)
                          Index Cond: ((event_ts > f.start_time) AND (event_ts < f.end_time) AND (measure_line = 1))
                          Filter: (lane = ANY ('{1,2,3,4}'::integer[]))
                          Buffers: shared hit=288
                    ->  Index Scan using volume_20190327_event_ts_lane_measure_line_idx on volume_20190327 v_3  (cost=0.42..229.48 rows=3157 width=12) (actual time=0.016..0.016 rows=0 loops=96)
                          Index Cond: ((event_ts > f.start_time) AND (event_ts < f.end_time) AND (measure_line = 1))
                          Filter: (lane = ANY ('{1,2,3,4}'::integer[]))
                          Buffers: shared hit=288
                    ->  Index Scan using volume_20190328_event_ts_lane_measure_line_idx on volume_20190328 v_4  (cost=0.42..228.97 rows=3157 width=12) (actual time=0.007..0.007 rows=0 loops=96)
                          Index Cond: ((event_ts > f.start_time) AND (event_ts < f.end_time) AND (measure_line = 1))
                          Filter: (lane = ANY ('{1,2,3,4}'::integer[]))
                          Buffers: shared hit=288
                    ->  Index Scan using volume_20190329_event_ts_lane_measure_line_idx on volume_20190329 v_5  (cost=0.42..230.02 rows=3227 width=12) (actual time=0.007..0.007 rows=0 loops=96)
                          Index Cond: ((event_ts > f.start_time) AND (event_ts < f.end_time) AND (measure_line = 1))
                          Filter: (lane = ANY ('{1,2,3,4}'::integer[]))
                          Buffers: shared hit=288
...
lane in (1), takes 5 minutes to complete:
Sort  (cost=16986442.13..16986442.63 rows=200 width=28) (actual time=332823.872..332824.048 rows=96 loops=1)
  Sort Key: f.start_time
  Sort Method: quicksort  Memory: 32kB
  Buffers: shared hit=24 read=113392, temp read=215935 written=2272
  CTE time_interval
    ->  Function Scan on generate_series start_time  (cost=0.00..12.50 rows=1000 width=16) (actual time=0.045..0.482 rows=96 loops=1)
  ->  HashAggregate  (cost=16986418.98..16986421.98 rows=200 width=28) (actual time=332823.319..332823.630 rows=96 loops=1)
        Group Key: f.start_time, f.end_time
        Buffers: shared hit=24 read=113392, temp read=215935 written=2272
        ->  Nested Loop Left Join  (cost=0.00..16386221.49 rows=80026333 width=20) (actual time=12066.893..332801.773 rows=7190 loops=1)
              Join Filter: ((v.event_ts > f.start_time) AND (v.event_ts < f.end_time))
              Rows Removed by Join Filter: 68689930
              Buffers: shared hit=24 read=113392, temp read=215935 written=2272
              ->  CTE Scan on time_interval f  (cost=0.00..20.00 rows=1000 width=16) (actual time=0.058..1.258 rows=96 loops=1)
              ->  Materialize  (cost=0.00..270371.58 rows=720237 width=12) (actual time=0.012..1865.908 rows=715595 loops=96)
                    Buffers: shared hit=24 read=113392, temp read=215935 written=2272
                    ->  Append  (cost=0.00..263253.39 rows=720237 width=12) (actual time=0.115..7524.225 rows=715595 loops=1)
                          Buffers: shared hit=24 read=113392
                          ->  Seq Scan on volume_20190324 v  (cost=0.00..2649.35 rows=7263 width=12) (actual time=0.107..83.392 rows=7205 loops=1)
                                Filter: ((measure_line = 1) AND (lane = 1))
                                Rows Removed by Filter: 93285
                                Buffers: shared hit=1 read=1141
                          ->  Seq Scan on volume_20190325 v_1  (cost=0.00..2644.50 rows=7276 width=12) (actual time=0.047..115.813 rows=7190 loops=1)
                                Filter: ((measure_line = 1) AND (lane = 1))
                                Rows Removed by Filter: 93110
                                Buffers: shared hit=1 read=1139
                          ->  Seq Scan on volume_20190326 v_2  (cost=0.00..2618.95 rows=7002 width=12) (actual time=0.049..77.066 rows=7105 loops=1)
                                Filter: ((measure_line = 1) AND (lane = 1))
                                Rows Removed by Filter: 92225
                                Buffers: shared hit=1 read=1128
                          ->  Seq Scan on volume_20190327 v_3  (cost=0.00..2644.05 rows=7190 width=12) (actual time=0.041..62.538 rows=7195 loops=1)
                                Filter: ((measure_line = 1) AND (lane = 1))
                                Rows Removed by Filter: 93075
                                Buffers: shared hit=1 read=1139
                          ->  Seq Scan on volume_20190328 v_4  (cost=0.00..2644.05 rows=7127 width=12) (actual time=0.026..44.041 rows=7200 loops=1)
                                Filter: ((measure_line = 1) AND (lane = 1))
                                Rows Removed by Filter: 93070
                                Buffers: shared hit=1 read=1139
                          ->  Seq Scan on volume_20190329 v_5  (cost=0.00..2643.90 rows=7236 width=12) (actual time=0.039..44.562 rows=7205 loops=1)
...