(TimescaleDB)-如何在约20亿行的数据集上优化JOIN查询。 (现在太慢了)

时间:2019-04-02 08:58:20

标签: postgresql join indexing query-performance timescaledb

我有一个数据库,该数据库可获取大量传感器数据(例如类似IOT)-并将这些数据存储在名为events的表中。 (下面的结构)

现在,用户还可以订阅特定信号的更改(按ID),并且可以请求从特定点开始的所有信号事件(也只有他们订阅的信号)。

问题在于此查询确实很慢(超过1分钟),我不明白为什么它这么慢。我是否缺少各种索引?

我要执行的查询:

SELECT events.*
FROM Events
   JOIN subscription_signal
      ON subscription_signal.subscription_id = $1 AND
         events.signal_id = subscription_signal.signal_id
WHERE source_time > $2
ORDER BY source_time ASC
LIMIT $3
create table if not exists events
(
    source_time timestamp with time zone not null,
    added_time timestamp with time zone default now(),
    signal_id integer not null,
    data_type text,
    analog_value numeric(18,4) default 0,
    digital_value boolean default false
);

create index if not exists events_source_time_idx
    on new_events (source_time desc);

create index if not exists vents_signal_id_source_time_index
    on new_events (signal_id, source_time);

create index if not exists events_source_time_signal_id_index
    on new_events (source_time, signal_id);

create index if not exists events_signal_id_index
    on new_events (signal_id);

create table if not exists subscription_signal
(
    subscription_id integer not null,
    signal_id integer not null,
    constraint subscription_signal_subscription_id_signal_id_uindex
        unique (subscription_id, signal_id)
);

我已经尝试添加多个索引,但是似乎并没有改善查询时间。

下面是一个EXPLAIN ANALYZE,它使用的subscription_id为11,在signal_id表中有114个subscription_signal

EXPLAIN ANALYZE

Limit  (cost=0.71..0.71 rows=500 width=33) (actual time=24018.215..372251.399 rows=274 loops=1)
  ->  Nested Loop  (cost=0.71..0.30 rows=72839547 width=33) (actual time=24018.214..372251.287 rows=274 loops=1)
        ->  Append  (cost=0.43..0.00 rows=0 width=33) (actual time=0.038..141484.028 rows=133615131 loops=1)
              ->  Index Scan Backward using _hyper_1_2_chunk_events_source_timestamp_idx on _hyper_1_2_chunk  (cost=0.43..184291.13 rows=5852043 width=32) (actual time=0.038..9359.166 rows=5852114 loops=1)
                    Index Cond: (source_timestamp > '2018-07-05 09:00:00.88+00'::timestamp with time zone)
              ->  Index Scan Backward using _hyper_1_3_chunk_events_source_timestamp_idx on _hyper_1_3_chunk  (cost=0.43..119619.56 rows=3733367 width=33) (actual time=0.096..4841.893 rows=3733367 loops=1)
                    Index Cond: (source_timestamp > '2018-07-05 09:00:00.88+00'::timestamp with time zone)
              ->  Index Scan Backward using _hyper_1_4_chunk_events_source_timestamp_idx on _hyper_1_4_chunk  (cost=0.43..101122.76 rows=3188104 width=33) (actual time=0.065..2088.177 rows=3188104 loops=1)
                    Index Cond: (source_timestamp > '2018-07-05 09:00:00.88+00'::timestamp with time zone)
              ->  Index Scan Backward using _hyper_1_5_chunk_events_source_timestamp_idx on _hyper_1_5_chunk  (cost=0.43..103410.32 rows=3223702 width=33) (actual time=0.055..3959.425 rows=3223702 loops=1)
                    Index Cond: (source_timestamp > '2018-07-05 09:00:00.88+00'::timestamp with time zone)
              ->  Index Scan Backward using _hyper_1_6_chunk_events_source_timestamp_idx on _hyper_1_6_chunk  (cost=0.43..130778.92 rows=4116765 width=33) (actual time=0.039..2917.270 rows=4116765 loops=1)
                    Index Cond: (source_timestamp > '2018-07-05 09:00:00.88+00'::timestamp with time zone)
              ->  Index Scan Backward using _hyper_1_7_chunk_events_source_timestamp_idx on _hyper_1_7_chunk  (cost=0.43..160960.05 rows=5047161 width=33) (actual time=0.030..3801.518 rows=5047161 loops=1)
                    Index Cond: (source_timestamp > '2018-07-05 09:00:00.88+00'::timestamp with time zone)
              ->  Index Scan Backward using _hyper_1_8_chunk_events_source_timestamp_idx on _hyper_1_8_chunk  (cost=0.43..161138.72 rows=5067119 width=33) (actual time=0.046..3106.090 rows=5067119 loops=1)
                    Index Cond: (source_timestamp > '2018-07-05 09:00:00.88+00'::timestamp with time zone)
              ->  Index Scan Backward using _hyper_1_9_chunk_events_source_timestamp_idx on _hyper_1_9_chunk  (cost=0.43..148498.73 rows=4523834 width=33) (actual time=0.050..4114.502 rows=4535789 loops=1)
                    Index Cond: (source_timestamp > '2018-07-05 09:00:00.88+00'::timestamp with time zone)
              ->  Index Scan Backward using _hyper_1_10_chunk_events_source_timestamp_idx on _hyper_1_10_chunk  (cost=0.43..149195.92 rows=4690233 width=33) (actual time=0.029..4904.397 rows=4690344 loops=1)
                    Index Cond: (source_timestamp > '2018-07-05 09:00:00.88+00'::timestamp with time zone)
              ->  Index Scan Backward using _hyper_1_11_chunk_events_source_timestamp_idx on _hyper_1_11_chunk  (cost=0.43..148274.33 rows=4627748 width=33) (actual time=0.034..4086.343 rows=4645264 loops=1)
                    Index Cond: (source_timestamp > '2018-07-05 09:00:00.88+00'::timestamp with time zone)
              ->  Index Scan Backward using _hyper_1_12_chunk_events_source_timestamp_idx on _hyper_1_12_chunk  (cost=0.43..147687.64 rows=4608886 width=33) (actual time=0.054..4305.309 rows=4612240 loops=1)
                    Index Cond: (source_timestamp > '2018-07-05 09:00:00.88+00'::timestamp with time zone)
              ->  Index Scan Backward using _hyper_1_13_chunk_events_source_timestamp_idx on _hyper_1_13_chunk  (cost=0.43..137341.97 rows=4280236 width=33) (actual time=0.049..6224.372 rows=4301055 loops=1)
                    Index Cond: (source_timestamp > '2018-07-05 09:00:00.88+00'::timestamp with time zone)
              ->  Index Scan Backward using _hyper_1_14_chunk_events_source_timestamp_idx on _hyper_1_14_chunk  (cost=0.43..141762.95 rows=4460189 width=33) (actual time=0.054..2724.318 rows=4460189 loops=1)
                    Index Cond: (source_timestamp > '2018-07-05 09:00:00.88+00'::timestamp with time zone)
              ->  Index Scan Backward using _hyper_1_15_chunk_events_source_timestamp_idx on _hyper_1_15_chunk  (cost=0.43..143731.09 rows=4522774 width=33) (actual time=0.029..3981.750 rows=4522875 loops=1)
                    Index Cond: (source_timestamp > '2018-07-05 09:00:00.88+00'::timestamp with time zone)
              ->  Index Scan Backward using _hyper_1_16_chunk_events_source_timestamp_idx on _hyper_1_16_chunk  (cost=0.43..150538.02 rows=4721239 width=33) (actual time=0.062..6109.135 rows=4722110 loops=1)
                    Index Cond: (source_timestamp > '2018-07-05 09:00:00.88+00'::timestamp with time zone)
              ->  Index Scan Backward using _hyper_1_17_chunk_events_source_timestamp_idx on _hyper_1_17_chunk  (cost=0.43..142216.56 rows=4475544 width=33) (actual time=0.060..3955.418 rows=4475511 loops=1)
                    Index Cond: (source_timestamp > '2018-07-05 09:00:00.88+00'::timestamp with time zone)
              ->  Index Scan Backward using _hyper_1_18_chunk_events_source_timestamp_idx on _hyper_1_18_chunk  (cost=0.43..137131.47 rows=4293396 width=33) (actual time=0.072..5002.332 rows=4294763 loops=1)
                    Index Cond: (source_timestamp > '2018-07-05 09:00:00.88+00'::timestamp with time zone)
              ->  Index Scan Backward using _hyper_1_19_chunk_events_source_timestamp_idx on _hyper_1_19_chunk  (cost=0.43..136908.18 rows=4284368 width=33) (actual time=0.100..5866.330 rows=4279079 loops=1)
                    Index Cond: (source_timestamp > '2018-07-05 09:00:00.88+00'::timestamp with time zone)
              ->  Index Scan Backward using _hyper_1_20_chunk_events_source_timestamp_idx on _hyper_1_20_chunk  (cost=0.43..142385.12 rows=4475205 width=33) (actual time=0.081..3407.288 rows=4475211 loops=1)
                    Index Cond: (source_timestamp > '2018-07-05 09:00:00.88+00'::timestamp with time zone)
              ->  Index Scan Backward using _hyper_1_21_chunk_events_source_timestamp_idx on _hyper_1_21_chunk  (cost=0.43..154334.53 rows=4854714 width=33) (actual time=0.082..3049.434 rows=4855427 loops=1)
                    Index Cond: (source_timestamp > '2018-07-05 09:00:00.88+00'::timestamp with time zone)
              ->  Index Scan Backward using _hyper_1_22_chunk_events_source_timestamp_idx on _hyper_1_22_chunk  (cost=0.43..149148.27 rows=4690356 width=33) (actual time=0.051..5394.662 rows=4690489 loops=1)
                    Index Cond: (source_timestamp > '2018-07-05 09:00:00.88+00'::timestamp with time zone)
              ->  Index Scan Backward using _hyper_1_23_chunk_events_source_timestamp_idx on _hyper_1_23_chunk  (cost=0.43..147018.61 rows=4624741 width=33) (actual time=0.042..2781.965 rows=4624838 loops=1)
                    Index Cond: (source_timestamp > '2018-07-05 09:00:00.88+00'::timestamp with time zone)
              ->  Index Scan Backward using _hyper_1_24_chunk_events_source_timestamp_idx on _hyper_1_24_chunk  (cost=0.43..151841.97 rows=4774145 width=33) (actual time=0.053..5188.953 rows=4776442 loops=1)
                    Index Cond: (source_timestamp > '2018-07-05 09:00:00.88+00'::timestamp with time zone)
              ->  Index Scan Backward using _hyper_1_25_chunk_events_source_timestamp_idx on _hyper_1_25_chunk  (cost=0.43..141470.65 rows=4449075 width=32) (actual time=0.066..4113.208 rows=4451125 loops=1)
                    Index Cond: (source_timestamp > '2018-07-05 09:00:00.88+00'::timestamp with time zone)
              ->  Index Scan Backward using _hyper_1_26_chunk_events_source_timestamp_idx on _hyper_1_26_chunk  (cost=0.43..133228.19 rows=4192152 width=32) (actual time=0.073..3794.901 rows=4192089 loops=1)
                    Index Cond: (source_timestamp > '2018-07-05 09:00:00.88+00'::timestamp with time zone)
              ->  Index Scan Backward using _hyper_1_27_chunk_events_source_timestamp_idx on _hyper_1_27_chunk  (cost=0.43..93412.39 rows=2946643 width=33) (actual time=0.034..2600.225 rows=2946741 loops=1)
                    Index Cond: (source_timestamp > '2018-07-05 09:00:00.88+00'::timestamp with time zone)
              ->  Index Scan Backward using _hyper_1_28_chunk_events_source_timestamp_idx on _hyper_1_28_chunk  (cost=0.43..107666.24 rows=3380023 width=33) (actual time=0.061..4010.668 rows=3380023 loops=1)
                    Index Cond: (source_timestamp > '2018-07-05 09:00:00.88+00'::timestamp with time zone)
              ->  Index Scan Backward using _hyper_1_29_chunk_events_source_timestamp_idx on _hyper_1_29_chunk  (cost=0.43..139070.65 rows=4376229 width=33) (actual time=0.058..3597.805 rows=4376329 loops=1)
                    Index Cond: (source_timestamp > '2018-07-05 09:00:00.88+00'::timestamp with time zone)
              ->  Index Scan Backward using _hyper_1_30_chunk_events_source_timestamp_idx on _hyper_1_30_chunk  (cost=0.43..133595.88 rows=4204134 width=33) (actual time=0.047..3998.751 rows=4204442 loops=1)
                    Index Cond: (source_timestamp > '2018-07-05 09:00:00.88+00'::timestamp with time zone)
              ->  Index Scan Backward using _hyper_1_31_chunk_events_source_timestamp_idx on _hyper_1_31_chunk  (cost=0.43..139554.54 rows=4391212 width=33) (actual time=0.054..4415.468 rows=4391126 loops=1)
                    Index Cond: (source_timestamp > '2018-07-05 09:00:00.88+00'::timestamp with time zone)
              ->  Index Scan Backward using _hyper_1_32_chunk_events_source_timestamp_idx on _hyper_1_32_chunk  (cost=0.43..78562.22 rows=2463045 width=33) (actual time=0.054..1672.220 rows=2482753 loops=1)
                    Index Cond: (source_timestamp > '2018-07-05 09:00:00.88+00'::timestamp with time zone)
              ->  Index Scan Backward using _hyper_1_1_chunk_events_source_timestamp_idx on _hyper_1_1_chunk  (cost=0.14..5.59 rows=62 width=32) (actual time=12.567..12.603 rows=62 loops=1)
                    Index Cond: (source_timestamp > '2018-07-05 09:00:00.88+00'::timestamp with time zone)
              ->  Index Scan Backward using _hyper_1_65_chunk_events_source_timestamp_idx on _hyper_1_65_chunk  (cost=0.15..16.92 rows=267 width=73) (actual time=8.766..8.768 rows=1 loops=1)
                    Index Cond: (source_timestamp > '2018-07-05 09:00:00.88+00'::timestamp with time zone)
              ->  Index Scan Backward using _hyper_1_88_chunk_events_source_timestamp_idx on _hyper_1_88_chunk  (cost=0.15..12.90 rows=364 width=32) (actual time=7.869..8.008 rows=374 loops=1)
                    Index Cond: (source_timestamp > '2018-07-05 09:00:00.88+00'::timestamp with time zone)
              ->  Index Scan Backward using _hyper_1_89_chunk_events_source_timestamp_idx on _hyper_1_89_chunk  (cost=0.14..6.16 rows=94 width=32) (actual time=0.017..0.047 rows=108 loops=1)
                    Index Cond: (source_timestamp > '2018-07-05 09:00:00.88+00'::timestamp with time zone)
        ->  Index Only Scan using subscription_signal_subscription_id_signal_id_uindex on subscription_signal  (cost=0.28..0.30 rows=1 width=4) (actual time=0.001..0.001 rows=0 loops=133615131)
              Index Cond: ((subscription_id = 11) AND (signal_id = _hyper_1_2_chunk.signal_id))
              Heap Fetches: 274
Planning time: 8.924 ms
Execution time: 372252.454 ms

EXPLAIN (ANALYZE, BUFFERS)

Limit  (cost=0.71..0.71 rows=500 width=33) (actual time=19504.097..315933.258 rows=274 loops=1)
  Buffers: shared hit=269362667 read=1352546
  ->  Nested Loop  (cost=0.71..0.30 rows=72839557 width=33) (actual time=19504.096..315933.158 rows=274 loops=1)
        Buffers: shared hit=269362667 read=1352546
        ->  Append  (cost=0.43..0.00 rows=0 width=33) (actual time=14.950..96402.213 rows=133615135 loops=1)
              Buffers: shared hit=2132123 read=1352546
              ->  Index Scan Backward using _hyper_1_2_chunk_events_source_timestamp_idx on _hyper_1_2_chunk  (cost=0.43..184291.13 rows=5852043 width=32) (actual time=0.019..4707.774 rows=5852114 loops=1)
                    Index Cond: (source_timestamp > '2018-07-05 09:00:00.88+00'::timestamp with time zone)
                    Buffers: shared hit=265328 read=20465
              ->  Index Scan Backward using _hyper_1_3_chunk_events_source_timestamp_idx on _hyper_1_3_chunk  (cost=0.43..119619.56 rows=3733367 width=33) (actual time=0.057..2192.118 rows=3733367 loops=1)
                    Index Cond: (source_timestamp > '2018-07-05 09:00:00.88+00'::timestamp with time zone)
                    Buffers: shared hit=72014 read=25960
              ->  Index Scan Backward using _hyper_1_4_chunk_events_source_timestamp_idx on _hyper_1_4_chunk  (cost=0.43..101122.76 rows=3188104 width=33) (actual time=0.064..1964.159 rows=3188104 loops=1)
                    Index Cond: (source_timestamp > '2018-07-05 09:00:00.88+00'::timestamp with time zone)
                    Buffers: shared hit=60389 read=23035
              ->  Index Scan Backward using _hyper_1_5_chunk_events_source_timestamp_idx on _hyper_1_5_chunk  (cost=0.43..103410.32 rows=3223702 width=33) (actual time=0.033..1769.853 rows=3223702 loops=1)
                    Index Cond: (source_timestamp > '2018-07-05 09:00:00.88+00'::timestamp with time zone)
                    Buffers: shared hit=61428 read=23456
              ->  Index Scan Backward using _hyper_1_6_chunk_events_source_timestamp_idx on _hyper_1_6_chunk  (cost=0.43..130778.92 rows=4116765 width=33) (actual time=0.029..2138.480 rows=4116765 loops=1)
                    Index Cond: (source_timestamp > '2018-07-05 09:00:00.88+00'::timestamp with time zone)
                    Buffers: shared hit=72968 read=30233
              ->  Index Scan Backward using _hyper_1_7_chunk_events_source_timestamp_idx on _hyper_1_7_chunk  (cost=0.43..160960.05 rows=5047161 width=33) (actual time=0.029..2628.668 rows=5047161 loops=1)
                    Index Cond: (source_timestamp > '2018-07-05 09:00:00.88+00'::timestamp with time zone)
                    Buffers: shared hit=93741 read=35257
              ->  Index Scan Backward using _hyper_1_8_chunk_events_source_timestamp_idx on _hyper_1_8_chunk  (cost=0.43..161138.72 rows=5067119 width=33) (actual time=0.036..2523.536 rows=5067119 loops=1)
                    Index Cond: (source_timestamp > '2018-07-05 09:00:00.88+00'::timestamp with time zone)
                    Buffers: shared hit=93406 read=35384
              ->  Index Scan Backward using _hyper_1_9_chunk_events_source_timestamp_idx on _hyper_1_9_chunk  (cost=0.43..148498.73 rows=4523834 width=33) (actual time=0.024..2890.820 rows=4535789 loops=1)
                    Index Cond: (source_timestamp > '2018-07-05 09:00:00.88+00'::timestamp with time zone)
                    Buffers: shared hit=86274 read=32252
              ->  Index Scan Backward using _hyper_1_10_chunk_events_source_timestamp_idx on _hyper_1_10_chunk  (cost=0.43..149195.92 rows=4690233 width=33) (actual time=0.025..2765.170 rows=4690344 loops=1)
                    Index Cond: (source_timestamp > '2018-07-05 09:00:00.88+00'::timestamp with time zone)
                    Buffers: shared hit=84883 read=37839
              ->  Index Scan Backward using _hyper_1_11_chunk_events_source_timestamp_idx on _hyper_1_11_chunk  (cost=0.43..148274.33 rows=4627748 width=33) (actual time=0.027..2902.346 rows=4645264 loops=1)
                    Index Cond: (source_timestamp > '2018-07-05 09:00:00.88+00'::timestamp with time zone)
                    Buffers: shared hit=64046 read=59908
              ->  Index Scan Backward using _hyper_1_12_chunk_events_source_timestamp_idx on _hyper_1_12_chunk  (cost=0.43..147687.64 rows=4608886 width=33) (actual time=0.049..3556.359 rows=4612240 loops=1)
                    Index Cond: (source_timestamp > '2018-07-05 09:00:00.88+00'::timestamp with time zone)
                    Buffers: shared hit=63025 read=59693
              ->  Index Scan Backward using _hyper_1_13_chunk_events_source_timestamp_idx on _hyper_1_13_chunk  (cost=0.43..137341.97 rows=4280236 width=33) (actual time=0.027..3361.873 rows=4301055 loops=1)
                    Index Cond: (source_timestamp > '2018-07-05 09:00:00.88+00'::timestamp with time zone)
                    Buffers: shared hit=54986 read=55571
              ->  Index Scan Backward using _hyper_1_14_chunk_events_source_timestamp_idx on _hyper_1_14_chunk  (cost=0.43..141762.95 rows=4460189 width=33) (actual time=0.028..2859.309 rows=4460189 loops=1)
                    Index Cond: (source_timestamp > '2018-07-05 09:00:00.88+00'::timestamp with time zone)
                    Buffers: shared hit=50767 read=57846
              ->  Index Scan Backward using _hyper_1_15_chunk_events_source_timestamp_idx on _hyper_1_15_chunk  (cost=0.43..143731.09 rows=4522774 width=33) (actual time=0.028..3143.475 rows=4522875 loops=1)
                    Index Cond: (source_timestamp > '2018-07-05 09:00:00.88+00'::timestamp with time zone)
                    Buffers: shared hit=54841 read=58390
              ->  Index Scan Backward using _hyper_1_16_chunk_events_source_timestamp_idx on _hyper_1_16_chunk  (cost=0.43..150538.02 rows=4721239 width=33) (actual time=0.028..2854.384 rows=4722110 loops=1)
                    Index Cond: (source_timestamp > '2018-07-05 09:00:00.88+00'::timestamp with time zone)
                    Buffers: shared hit=56488 read=61480
              ->  Index Scan Backward using _hyper_1_17_chunk_events_source_timestamp_idx on _hyper_1_17_chunk  (cost=0.43..142216.56 rows=4475544 width=33) (actual time=0.060..2522.463 rows=4475511 loops=1)
                    Index Cond: (source_timestamp > '2018-07-05 09:00:00.88+00'::timestamp with time zone)
                    Buffers: shared hit=54403 read=57798
              ->  Index Scan Backward using _hyper_1_18_chunk_events_source_timestamp_idx on _hyper_1_18_chunk  (cost=0.43..137131.47 rows=4293396 width=33) (actual time=0.051..2854.544 rows=4294763 loops=1)
                    Index Cond: (source_timestamp > '2018-07-05 09:00:00.88+00'::timestamp with time zone)
                    Buffers: shared hit=49427 read=56050
              ->  Index Scan Backward using _hyper_1_19_chunk_events_source_timestamp_idx on _hyper_1_19_chunk  (cost=0.43..136908.18 rows=4284368 width=33) (actual time=0.033..2673.792 rows=4279079 loops=1)
                    Index Cond: (source_timestamp > '2018-07-05 09:00:00.88+00'::timestamp with time zone)
                    Buffers: shared hit=49055 read=55966
              ->  Index Scan Backward using _hyper_1_20_chunk_events_source_timestamp_idx on _hyper_1_20_chunk  (cost=0.43..142385.12 rows=4475205 width=33) (actual time=0.029..2793.156 rows=4475211 loops=1)
                    Index Cond: (source_timestamp > '2018-07-05 09:00:00.88+00'::timestamp with time zone)
                    Buffers: shared hit=60428 read=57064
              ->  Index Scan Backward using _hyper_1_21_chunk_events_source_timestamp_idx on _hyper_1_21_chunk  (cost=0.43..154334.53 rows=4854714 width=33) (actual time=0.022..2701.886 rows=4855427 loops=1)
                    Index Cond: (source_timestamp > '2018-07-05 09:00:00.88+00'::timestamp with time zone)
                    Buffers: shared hit=63812 read=62451
              ->  Index Scan Backward using _hyper_1_22_chunk_events_source_timestamp_idx on _hyper_1_22_chunk  (cost=0.43..149148.27 rows=4690356 width=33) (actual time=0.028..2763.383 rows=4690489 loops=1)
                    Index Cond: (source_timestamp > '2018-07-05 09:00:00.88+00'::timestamp with time zone)
                    Buffers: shared hit=62684 read=60247
              ->  Index Scan Backward using _hyper_1_23_chunk_events_source_timestamp_idx on _hyper_1_23_chunk  (cost=0.43..147018.61 rows=4624741 width=33) (actual time=0.025..2823.549 rows=4624838 loops=1)
                    Index Cond: (source_timestamp > '2018-07-05 09:00:00.88+00'::timestamp with time zone)
                    Buffers: shared hit=52489 read=59932
              ->  Index Scan Backward using _hyper_1_24_chunk_events_source_timestamp_idx on _hyper_1_24_chunk  (cost=0.43..151841.97 rows=4774145 width=33) (actual time=0.060..2819.668 rows=4776442 loops=1)
                    Index Cond: (source_timestamp > '2018-07-05 09:00:00.88+00'::timestamp with time zone)
                    Buffers: shared hit=48917 read=62622
              ->  Index Scan Backward using _hyper_1_25_chunk_events_source_timestamp_idx on _hyper_1_25_chunk  (cost=0.43..141470.65 rows=4449075 width=32) (actual time=0.057..2554.301 rows=4451125 loops=1)
                    Index Cond: (source_timestamp > '2018-07-05 09:00:00.88+00'::timestamp with time zone)
                    Buffers: shared hit=40129 read=58563
              ->  Index Scan Backward using _hyper_1_26_chunk_events_source_timestamp_idx on _hyper_1_26_chunk  (cost=0.43..133228.19 rows=4192152 width=32) (actual time=0.049..2467.922 rows=4192089 loops=1)
                    Index Cond: (source_timestamp > '2018-07-05 09:00:00.88+00'::timestamp with time zone)
                    Buffers: shared hit=55849 read=37756
              ->  Index Scan Backward using _hyper_1_27_chunk_events_source_timestamp_idx on _hyper_1_27_chunk  (cost=0.43..93412.39 rows=2946643 width=33) (actual time=0.048..1754.935 rows=2946741 loops=1)
                    Index Cond: (source_timestamp > '2018-07-05 09:00:00.88+00'::timestamp with time zone)
                    Buffers: shared hit=47717 read=23505
              ->  Index Scan Backward using _hyper_1_28_chunk_events_source_timestamp_idx on _hyper_1_28_chunk  (cost=0.43..107666.24 rows=3380023 width=33) (actual time=0.027..1896.157 rows=3380023 loops=1)
                    Index Cond: (source_timestamp > '2018-07-05 09:00:00.88+00'::timestamp with time zone)
                    Buffers: shared hit=66850 read=23284
              ->  Index Scan Backward using _hyper_1_29_chunk_events_source_timestamp_idx on _hyper_1_29_chunk  (cost=0.43..139070.65 rows=4376229 width=33) (actual time=0.028..2469.809 rows=4376329 loops=1)
                    Index Cond: (source_timestamp > '2018-07-05 09:00:00.88+00'::timestamp with time zone)
                    Buffers: shared hit=77118 read=31953
              ->  Index Scan Backward using _hyper_1_30_chunk_events_source_timestamp_idx on _hyper_1_30_chunk  (cost=0.43..133595.88 rows=4204134 width=33) (actual time=0.032..2862.879 rows=4204442 loops=1)
                    Index Cond: (source_timestamp > '2018-07-05 09:00:00.88+00'::timestamp with time zone)
                    Buffers: shared hit=64898 read=33521
              ->  Index Scan Backward using _hyper_1_31_chunk_events_source_timestamp_idx on _hyper_1_31_chunk  (cost=0.43..139554.54 rows=4391212 width=33) (actual time=0.038..2431.625 rows=4391126 loops=1)
                    Index Cond: (source_timestamp > '2018-07-05 09:00:00.88+00'::timestamp with time zone)
                    Buffers: shared hit=68406 read=34357
              ->  Index Scan Backward using _hyper_1_32_chunk_events_source_timestamp_idx on _hyper_1_32_chunk  (cost=0.43..78562.22 rows=2463045 width=33) (actual time=0.060..1396.815 rows=2482753 loops=1)
                    Index Cond: (source_timestamp > '2018-07-05 09:00:00.88+00'::timestamp with time zone)
                    Buffers: shared hit=35270 read=20706
              ->  Index Scan Backward using _hyper_1_1_chunk_events_source_timestamp_idx on _hyper_1_1_chunk  (cost=0.14..5.59 rows=62 width=32) (actual time=0.030..0.095 rows=62 loops=1)
                    Index Cond: (source_timestamp > '2018-07-05 09:00:00.88+00'::timestamp with time zone)
                    Buffers: shared hit=4 read=1
              ->  Index Scan Backward using _hyper_1_65_chunk_events_source_timestamp_idx on _hyper_1_65_chunk  (cost=0.15..16.92 rows=267 width=73) (actual time=0.042..0.043 rows=1 loops=1)
                    Index Cond: (source_timestamp > '2018-07-05 09:00:00.88+00'::timestamp with time zone)
                    Buffers: shared hit=2 read=1
              ->  Index Scan Backward using _hyper_1_88_chunk_events_source_timestamp_idx on _hyper_1_88_chunk  (cost=0.15..12.90 rows=364 width=32) (actual time=0.022..0.140 rows=374 loops=1)
                    Index Cond: (source_timestamp > '2018-07-05 09:00:00.88+00'::timestamp with time zone)
                    Buffers: shared hit=72
              ->  Index Scan Backward using _hyper_1_89_chunk_events_source_timestamp_idx on _hyper_1_89_chunk  (cost=0.14..6.48 rows=112 width=32) (actual time=0.019..0.053 rows=112 loops=1)
                    Index Cond: (source_timestamp > '2018-07-05 09:00:00.88+00'::timestamp with time zone)
                    Buffers: shared hit=9
        ->  Index Only Scan using subscription_signal_subscription_id_signal_id_uindex on subscription_signal  (cost=0.28..0.30 rows=1 width=4) (actual time=0.001..0.001 rows=0 loops=133615135)
              Index Cond: ((subscription_id = 11) AND (signal_id = _hyper_1_2_chunk.signal_id))
              Heap Fetches: 274
              Buffers: shared hit=267230544
Planning time: 144.220 ms
Execution time: 315933.896 ms

4 个答案:

答案 0 :(得分:1)

events没有被source_timestamp分区,对吧?

PostgreSQL别无选择,只能用source_timestamp > '2018-07-05 09:00:00.88+00'查找所有行,然后用events联接结果。那永远不会很快。

如果省略LIMIT可以使查询更快,则可以通过使用嵌套循环联接将PostgreSQL朝向 not 来获得一些好处,但是考虑到涉及的行数,这不会似乎有可能。

如果表被source_timestamp分区,则查询可能会更快。

答案 1 :(得分:0)

尝试删除除(signal_id, source_time)上的索引以外的所有索引,并使用以下查询:

SELECT *
FROM events
WHERE signal_id IN (
    SELECT signal_id
    FROM subscription_signal
    WHERE subscription_id = $1)
AND source_time > $2
ORDER BY source_time
LIMIT $3

然后,PostgreSQL应该使用剩余的索引快速查找给定signal_id列表的行,然后使用索引的第二部分来查找带有source_time > $2的行。

在以下情况下查询可能会运行缓慢:

  • 如果匹配的行数太多(例如,超过表中总行数的10%)。在这种情况下,PostgreSQL可能会选择全表扫描而不是索引查找。
  • 如果匹配的行数适中(<1M),但这些行在物理上分散在慢速磁盘(HDD,Amazon,Google Cloud或Microsoft Azure中的网络连接的持久性存储)上的表存储中。在这种情况下,性能将受到磁盘iops和读取带宽的限制。

答案 2 :(得分:0)

我不知道您是否已经解决了您的问题,但是当我遇到此问题时,您可能需要自己查看解释输出:https://explain.depesz.com/s/4GYT

我还不知道为什么(但是),但是查询计划者应该生成一个MergeAppend而不是Append执行节点,以便能够将LIMIT下推到每个分区中(索引)扫描,就像在我的测试设置中一样:https://explain.depesz.com/s/Fi9C

为了进行比较,同一查询没有限制以显示我在这里使用的测试数据量:https://explain.depesz.com/s/AJ6 (是的,只有37M中间行,但是显示了效果) < / p>

对于急躁的人数:

          Without LIMIT: 49,174.946 milliseconds
LIMIT 274 OFFSET 100000: 35,022.269 milliseconds
              LIMIT 274:      2.710 milliseconds

LIMIT的不同值以及查询参数代替常量也不会导致任何不同的计划)

MergeAppend的这种优化自9.1版(请参见提交6fbc323)以来就已经存在于PostgreSQL中,无论是否使用分区,Timescale都使用分区,但是对于简单的Append不存在这种情况,这告诉我是这里的关键瓶颈。反过来,这可能会受到最新统计信息的影响(您在两个表上都使用过analyze,还是?)。

即将发布的PostgreSQL 12版本应该还将LIMIT ... OFFSET ...下移到具有提交959d00eAppend执行节点中。

OTOH,即使在具有2+十亿行的时间刻度优化表中,每个分区有10+百万行,时间也可以花费在I / O上。要对此进行测试,请执行以下操作并重新创建一个EXPLAIN (ANALYZE, BUFFERS, COSTS OFF)计划:

SET track_io_timing TO ON;

然后在执行计划中包括每个相关节点上I / O操作的时间(总和),例如:

 Limit (actual time=1546.517..1546.572 rows=274 loops=1)
   Buffers: shared hit=73438 read=23452
   I/O Timings: read=1316.373
   ->  Sort (actual time=1546.512..1546.536 rows=274 loops=1)
         Sort Key: e.source_time
         Sort Method: top-N heapsort  Memory: 46kB
         Buffers: shared hit=73438 read=23452
         I/O Timings: read=1316.373
         ->  Nested Loop (actual time=7.147..1524.251 rows=69293 loops=1)
               Buffers: shared hit=73438 read=23452
               I/O Timings: read=1316.373
               ->  Index Only Scan using subscription_signal_subscription_id_signal_id_uindex on subscription_signal (actual time=0.038..0.236 rows=114 loops=1)
                     Index Cond: (subscription_id = $1)
                     Heap Fetches: 114
                     Buffers: shared hit=4
               ->  Append (actual time=0.334..13.184 rows=608 loops=114)
                     Buffers: shared hit=73434 read=23452
                     I/O Timings: read=1316.373
                     ->  Seq Scan on events e (actual time=0.000..0.000 rows=0 loops=114)
                           Filter: ((source_time > $2) AND (subscription_signal.signal_id = signal_id))
                     ->  Index Scan using _hyper_5_13094_chunk_events_signal_id_source_time_index on _hyper_5_13094_chunk e_1 (actual time=0.014..0.014 rows=0 loops=114)
                           Index Cond: ((signal_id = subscription_signal.signal_id) AND (source_time > $2))
                           Buffers: shared hit=340 read=3
                           I/O Timings: read=0.851
                     ->  Index Scan using _hyper_5_13095_chunk_events_signal_id_source_time_index on _hyper_5_13095_chunk e_2 (actual time=0.009..0.009 rows=0 loops=114)
                           Index Cond: ((signal_id = subscription_signal.signal_id) AND (source_time > $2))
                           Buffers: shared hit=335 read=7
                           I/O Timings: read=0.398
...

这对于检测查询实际上是CPU约束还是I / O约束(这是更大的瓶颈,因此是改善状况的最佳投资)非常有用。

减少中间结果行

或者,最重要的部分是大大减少中间行的数量。正如我们所看到的,您实际返回的行数是符合条件的行的〜0.000205%,我真的敦促您重新考虑您的条件。

在时间序列查询中,您几乎永远不会只使用下限进行过滤,而是想出一个合理的间隔,该间隔会导致在所有情况下95%的情况下有足够的行返回LIMIT,并且对于其他5%(行数不足)的用户,只需使用后续时间范围再次查询即可。

因此,只需尝试以下查询即可:

SELECT events.*
FROM Events
   JOIN subscription_signal
      ON subscription_signal.subscription_id = $1 AND
         events.signal_id = subscription_signal.signal_id
WHERE source_time > $2 AND source_time <= ($2 + '1 week'::interval)
ORDER BY source_time ASC
LIMIT $3

仅此一项就可以大大减少查询时间。

答案 3 :(得分:-1)

尝试这样的事情:

SELECT *
FROM events
WHERE signal_id IN ( SELECT signal_id
                     FROM subscription_signal
                    WHERE subscription_id = $1
                      AND source_time > $2 )
ORDER BY source_time ASC
LIMIT $3