Why do different query values produce different index algorithms?

Date: 2018-07-10 21:14:54

Tags: postgresql

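I have a query, and I created an index specifically for it. However, I found that with certain values the query stops running fast and performs a full scan instead.

This is the case where it executes quickly: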

EXPLAIN ANALYZE SELECT
    v.valtr_id,
    v.block_num,
    v.from_id,
    v.to_id,
    v.from_balance::text,
    v.to_balance::text
FROM value_transfer v
WHERE
    (v.block_num <= 2748053) AND
    (
        (v.to_id = 639291) OR
        (v.from_id = 639291)
    )
ORDER BY
    v.block_num DESC, v.valtr_id DESC
LIMIT 1;

 Limit  (cost=23054.03..23054.03 rows=1 width=30) (actual time=1.464..1.465 rows=1 loops=1)
   ->  Sort  (cost=23054.03..23068.94 rows=5964 width=30) (actual time=1.462..1.462 rows=1 loops=1)
         Sort Key: block_num DESC, valtr_id DESC
         Sort Method: top-N heapsort  Memory: 25kB
         ->  Bitmap Heap Scan on value_transfer v  (cost=144.85..23024.21 rows=5964 width=30) (actual time=1.397..1.437 rows=3 loops=1)
               Recheck Cond: ((to_id = 639291) OR (from_id = 639291))
               Filter: (block_num <= 2748053)
               Heap Blocks: exact=3
               ->  BitmapOr  (cost=144.85..144.85 rows=5964 width=0) (actual time=1.339..1.339 rows=0 loops=1)
                     ->  Bitmap Index Scan on vt_to_id_idx  (cost=0.00..40.42 rows=1580 width=0) (actual time=0.755..0.755 rows=1 loops=1)
                           Index Cond: (to_id = 639291)
                     ->  Bitmap Index Scan on vt_from_id_idx  (cost=0.00..101.45 rows=4384 width=0) (actual time=0.580..0.580 rows=2 loops=1)
                           Index Cond: (from_id = 639291)
 Planning time: 0.499 ms
 Execution time: 1.556 ms
(15 rows)

But if I pass the value 199658 to the query, it uses a different search algorithm:

EXPLAIN ANALYZE SELECT
    v.valtr_id,
    v.block_num,
    v.from_id,
    v.to_id,
    v.from_balance::text,
    v.to_balance::text
FROM value_transfer v
WHERE
    (v.block_num <= 2748053) AND
    (
        (v.to_id = 199658) OR
        (v.from_id = 199658)
    )
ORDER BY
    v.block_num DESC, v.valtr_id DESC
LIMIT 1;

-----------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.57..6462.99 rows=1 width=30) (actual time=614109.855..614109.856 rows=1 loops=1)
   ->  Index Scan Backward using bnum_valtr_idx on value_transfer v  (cost=0.57..200845479.66 rows=31079 width=30) (actual time=614109.853..614109.853 rows=1 loops=1)
         Index Cond: (block_num <= 2748053)
         Filter: ((to_id = 199658) OR (from_id = 199658))
         Rows Removed by Filter: 101190609
 Planning time: 0.515 ms
 Execution time: 614109.920 ms
(7 rows)

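Why is this happening? I thought that once an index had been created for the query, execution would always take the same path, but that is not the case. How can I make sure Postgres always uses the same algorithm for every search?

I even suspected that the index had not been built cleanly, so I rebuilt the main index: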

postgres=> drop index bnum_valtr_idx;
DROP INDEX
postgres=>  CREATE INDEX bnum_valtr_idx ON public.value_transfer USING btree (block_num DESC, valtr_id DESC);
CREATE INDEX
postgres=> 

However, this changed nothing.

My table definition is:

CREATE TABLE value_transfer (
    valtr_id            BIGSERIAL       PRIMARY KEY,
    tx_id               BIGINT          REFERENCES transaction(tx_id) ON DELETE CASCADE ON UPDATE CASCADE,
    block_id            INT             REFERENCES block(block_id) ON DELETE CASCADE ON UPDATE CASCADE,
    block_num           INT             NOT NULL,
    from_id             INT             NOT NULL,
    to_id               INT             NOT NULL,
    value               NUMERIC         DEFAULT 0,
    from_balance        NUMERIC         DEFAULT 0,
    to_balance          NUMERIC         DEFAULT 0,
    kind                CHAR            NOT NULL,
    depth               INT             DEFAULT 0,
    error               TEXT            NOT NULL
);

postgres=> SELECT * FROM pg_indexes WHERE tablename = 'value_transfer';
 schemaname |   tablename    |      indexname      | tablespace |                                             indexdef                                             
------------+----------------+---------------------+------------+--------------------------------------------------------------------------------------------------
 public     | value_transfer | bnum_valtr_idx      |            | CREATE INDEX bnum_valtr_idx ON public.value_transfer USING btree (block_num DESC, valtr_id DESC)
 public     | value_transfer | value_transfer_pkey |            | CREATE UNIQUE INDEX value_transfer_pkey ON public.value_transfer USING btree (valtr_id)
 public     | value_transfer | vt_tx_from_idx      |            | CREATE INDEX vt_tx_from_idx ON public.value_transfer USING btree (tx_id)
 public     | value_transfer | vt_block_num_idx    |            | CREATE INDEX vt_block_num_idx ON public.value_transfer USING btree (block_num)
 public     | value_transfer | vt_from_id_idx      |            | CREATE INDEX vt_from_id_idx ON public.value_transfer USING btree (from_id)
 public     | value_transfer | vt_to_id_idx        |            | CREATE INDEX vt_to_id_idx ON public.value_transfer USING btree (to_id)
 public     | value_transfer | vt_block_id_idx     |            | CREATE INDEX vt_block_id_idx ON public.value_transfer USING btree (block_id)
(7 rows)

postgres=> 

1 answer:

Answer 0: (score: 0)

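It may be that one value appears mostly in one column and the other value mostly in the other. In any case, OR across different columns is notorious for performance problems. PostgreSQL can combine two indexes with a BitmapOr, as the fast plan shows, but that path returns rows in no particular order, so they still have to be sorted before the LIMIT is applied. The alternative is to walk bnum_valtr_idx backward, which already delivers rows in the requested order, and filter on to_id/from_id until the first match. For 199658 the planner estimates far more matching rows (31079, versus 5964 for 639291), so it expects to hit a match almost immediately and prices that plan lower (6462.99 versus 23054.03). The estimate assumes the matching rows are spread evenly across block_num; here they are not, and 101190609 rows were discarded by the filter before the first match was found.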
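
To see what the planner believes about these two columns, the statistics can be inspected directly. The following is only a sketch: the statistics target of 1000 is an illustrative value, not a recommendation, and better per-column statistics may not help at all if the real problem is the correlation between the id and block_num.

```sql
-- Show the planner's view of the two id columns (pg_stats is a standard system view).
SELECT attname, n_distinct, most_common_vals, most_common_freqs
FROM pg_stats
WHERE schemaname = 'public'
  AND tablename  = 'value_transfer'
  AND attname IN ('from_id', 'to_id');

-- Assumption: a larger sample might tighten the row estimates.
-- 1000 is an illustrative target chosen for this example.
ALTER TABLE value_transfer ALTER COLUMN from_id SET STATISTICS 1000;
ALTER TABLE value_transfer ALTER COLUMN to_id   SET STATISTICS 1000;
ANALYZE value_transfer;
```

After ANALYZE, rerunning EXPLAIN on the slow query shows whether the estimate of 31079 rows has moved.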

The way to fix this is to split the query into a union.

Try this:

SELECT * FROM (
  SELECT
    valtr_id,
    block_num,
    from_id,
    to_id,
    from_balance::text,
    to_balance::text
  FROM value_transfer
  WHERE block_num <= 2748053
    AND to_id = 199658
  UNION ALL
  SELECT
    valtr_id,
    block_num,
    from_id,
    to_id,
    from_balance::text,
    to_balance::text
  FROM value_transfer
  WHERE block_num <= 2748053
    AND from_id = 199658
) x
ORDER BY block_num DESC, valtr_id DESC
LIMIT 1;
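
A possible refinement of the union, offered as a sketch rather than a tested fix: give each branch its own ORDER BY and LIMIT 1, and back each branch with a composite index that starts with the id column. The index names vt_to_id_bnum_idx and vt_from_id_bnum_idx are hypothetical and do not exist in the schema above. With such indexes, each branch should be able to read its newest matching row straight from the index, regardless of which id is passed in.

```sql
-- Hypothetical indexes (names invented for this example):
-- equality column first, then the columns used for ordering.
CREATE INDEX vt_to_id_bnum_idx
    ON value_transfer (to_id, block_num DESC, valtr_id DESC);
CREATE INDEX vt_from_id_bnum_idx
    ON value_transfer (from_id, block_num DESC, valtr_id DESC);

-- Each parenthesized branch returns at most one row;
-- the outer ORDER BY / LIMIT picks the newer of the two.
SELECT * FROM (
  (SELECT valtr_id, block_num, from_id, to_id,
          from_balance::text, to_balance::text
   FROM value_transfer
   WHERE to_id = 199658 AND block_num <= 2748053
   ORDER BY block_num DESC, valtr_id DESC
   LIMIT 1)
  UNION ALL
  (SELECT valtr_id, block_num, from_id, to_id,
          from_balance::text, to_balance::text
   FROM value_transfer
   WHERE from_id = 199658 AND block_num <= 2748053
   ORDER BY block_num DESC, valtr_id DESC
   LIMIT 1)
) x
ORDER BY block_num DESC, valtr_id DESC
LIMIT 1;
```

The parentheses around each branch are required so that ORDER BY and LIMIT apply to the branch rather than to the whole UNION ALL. Running EXPLAIN ANALYZE with both 639291 and 199658 would confirm whether the two values now get the same plan.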