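Are Postgres multicolumn indexes used for OR queries?

Time: 2019-10-04 09:22:48

Tags: postgresql

This seems like a simple question, but I could not find an answer online.

I am using Postgres 9.4 and have this table: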

                                                 Table "public.title"
             Column              |          Type           | Collation | Nullable |              Default              
---------------------------------+-------------------------+-----------+----------+-----------------------------------
 id                              | integer                 |           | not null | nextval('title_id_seq'::regclass) 
 name1                           | character varying(1000) |           |          | 
 name2                           | character varying(1000) |           |          | 
 name3                           | character varying(1000) |           |          | 
 name4                           | character varying(1000) |           |          | 

I have a multicolumn index:

"idx_title_names" btree (name1, name2, name3, name4)

But for OR queries, the index is not used:

EXPLAIN ANALYZE SELECT * FROM "title" WHERE ("title"."name1" = 'foo'
   OR "title"."name2" = 'foo' OR "title"."name3" = 'foo' OR "title"."name4" = 'foo');

 Gather  (cost=1000.00..436451.46 rows=659 width=4500) (actual time=561.418..1297.877 rows=3222 loops=1)
   Workers Planned: 2
   Workers Launched: 2
   ->  Parallel Seq Scan on title  (cost=0.00..435385.56 rows=275 width=4500) (actual time=551.627..1286.724 rows=1074 loops=3)
         Filter: (((name1)::text = 'foo'::text) OR ((name2)::text = 'foo'::text) OR ((name3)::text = 'foo'::text) OR ((name4)::text = 'foo'::text))
         Rows Removed by Filter: 1231911
 Planning Time: 0.102 ms
 Execution Time: 1298.148 ms

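Is this because such indexes do not work for OR queries?

And if so, is my best option to create 4 separate standard indexes?

3 Answers:

Answer 0: (score: 2)

One option is to create a GIN index on an array of the columns and then use an array operator: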

create index on title using gin (array[name1,name2,name3,name4]);

Then use

SELECT * 
FROM title 
WHERE array[name1,name2,name3,name4] @> array['foo'];

Note that a GIN index is more expensive to maintain than a BTree index.
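
An expression index is only considered when the query uses the same expression that was indexed, so it is worth checking the plan. The following is a minimal sketch, assuming the GIN index above has been created. The cast on the right-hand side is an assumption on my part: the columns are character varying, so the indexed array is character varying[], and the array literal may need the same type for the @> operator to resolve.

```sql
-- Assumes the GIN expression index from this answer already exists.
-- The left-hand expression must match the indexed expression exactly.
-- The cast is an assumption: it gives both sides the type
-- character varying[] so that @> can be resolved.
EXPLAIN
SELECT *
FROM title
WHERE array[name1, name2, name3, name4] @> array['foo']::character varying[];
```

If the index is usable, the plan should mention a bitmap scan on the GIN index rather than a sequential scan on title.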

Answer 1: (score: 1)

OR is often a performance problem in SQL.

The index cannot be used for a condition like this.

Your best option is to create four single-column indexes and hope for a Bitmap Or:

CREATE INDEX ON public.title (name1);
CREATE INDEX ON public.title (name2);
CREATE INDEX ON public.title (name3);
CREATE INDEX ON public.title (name4);
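
Whether the planner actually combines the four indexes depends on its cost estimates, so the plan is worth checking. The sketch below re-runs the question's query and also shows a UNION rewrite, which is a different technique from the one in this answer. The rewrite assumes the table has no fully identical rows (for example because id is unique), since UNION removes duplicate rows.

```sql
-- After creating the four single-column indexes, check the chosen plan.
EXPLAIN
SELECT *
FROM title
WHERE name1 = 'foo' OR name2 = 'foo' OR name3 = 'foo' OR name4 = 'foo';

-- Alternative sketch: one branch per column, each able to use its own index.
-- Assumption: no two rows are identical in every column, otherwise
-- UNION would return fewer rows than the OR query.
SELECT * FROM title WHERE name1 = 'foo'
UNION
SELECT * FROM title WHERE name2 = 'foo'
UNION
SELECT * FROM title WHERE name3 = 'foo'
UNION
SELECT * FROM title WHERE name4 = 'foo';
```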

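Answer 2: (score: 1)

If you have an index on (col1, col2, col3, etc), it will be used for conditions/sorting on col1, on col1 and col2, on col1, col2 and col3, and so on. It will not be used for conditions/sorting on, for example, col3 alone.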

Look at this:

# create table t as select random() as a, random() as b from generate_series(1,1000000);
# create index i on t(a,b);
# analyze t;

# explain analyze select * from t where a > 0.9;
                                                      QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on t  (cost=2246.83..8863.15 rows=96826 width=16) (actual time=10.973..28.023 rows=99311 loops=1)
   Recheck Cond: (a > '0.9'::double precision)
   Heap Blocks: exact=5406
   ->  Bitmap Index Scan on i  (cost=0.00..2222.62 rows=96826 width=0) (actual time=10.251..10.252 rows=99311 loops=1)
         Index Cond: (a > '0.9'::double precision)
 Planning Time: 0.348 ms
 Execution Time: 31.054 ms

# explain analyze select * from t where b > 0.9;
                                                QUERY PLAN
----------------------------------------------------------------------------------------------------------
 Seq Scan on t  (cost=0.00..17906.00 rows=99117 width=16) (actual time=0.015..70.505 rows=100137 loops=1)
   Filter: (b > '0.9'::double precision)
   Rows Removed by Filter: 899863
 Planning Time: 0.090 ms
 Execution Time: 73.656 ms

But when you use an or condition, the DBMS effectively has to execute several queries. In our example, select * from t where a > 0.9 or b > 0.9 is equivalent to select * from t where a > 0.9 (which can use the index) combined with select * from t where b > 0.9 (which cannot use the index). So the DBMS performs only one operation (scanning the whole table) instead of two (scanning the index and then scanning the whole table).

I hope this explains why the index is not used for your query.
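
To see the effect described in this answer, the OR query can be explained directly against the same test table. This is a sketch only: it assumes the table t and index i from the example above, and the index name i_b is hypothetical. No output is shown because the chosen plan depends on the planner's estimates. With roughly 19% of the rows matching, a sequential scan may still win even after the second index exists.

```sql
-- With only the index on (a, b), the b > 0.9 branch has no efficient
-- index access path, so a single pass over the table is the likely plan.
EXPLAIN ANALYZE
SELECT * FROM t WHERE a > 0.9 OR b > 0.9;

-- Hypothetical extra index so that each branch has its own access path.
CREATE INDEX i_b ON t (b);
ANALYZE t;

-- Re-check: the planner is now free to combine both indexes,
-- but it may still prefer a sequential scan at this selectivity.
EXPLAIN ANALYZE
SELECT * FROM t WHERE a > 0.9 OR b > 0.9;
```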