I have a table:
CREATE TABLE foo
(
    id  SERIAL,
    bar DATE,
    baz DATE
);
It has several million rows, and I would like to filter them efficiently based on whether the two date values differ:
-- either
SELECT * FROM foo WHERE bar = baz;
SELECT * FROM foo WHERE bar != baz;
-- or
SELECT * FROM foo WHERE bar IS NOT DISTINCT FROM baz;
SELECT * FROM foo WHERE bar IS DISTINCT FROM baz;
I have tried the following index combinations:
CREATE INDEX i_1 ON foo ((bar IS NOT DISTINCT FROM baz));
CREATE INDEX i_2 ON foo ((bar IS DISTINCT FROM baz));
CREATE INDEX i_1 ON foo ((bar = baz));
CREATE INDEX i_2 ON foo ((bar != baz));
and even partial indexes on a placeholder column:
CREATE INDEX i_1 ON foo (id) WHERE bar IS NOT DISTINCT FROM baz;
CREATE INDEX i_2 ON foo (id) WHERE bar IS DISTINCT FROM baz;
CREATE INDEX i_1 ON foo (id) WHERE bar = baz;
CREATE INDEX i_2 ON foo (id) WHERE bar != baz;
But for some reason, these indexes do not seem to be picked up at all:
EXPLAIN ANALYZE SELECT * FROM foo WHERE bar IS DISTINCT FROM baz
----
QUERY PLAN
Seq Scan on foo (cost=0.00..283185.05 rows=3249594 width=2609) (actual time=0.088..2272.179 rows=2165900 loops=1)
  Filter: (bar IS DISTINCT FROM baz)
  Rows Removed by Filter: 1100024
Planning time: 0.561 ms
Execution time: 2433.678 ms
and:
EXPLAIN ANALYZE SELECT * FROM foo WHERE bar IS NOT DISTINCT FROM baz
----
QUERY PLAN
Gather (cost=1000.00..262004.02 rows=16330 width=2609) (actual time=0.298..1599.461 rows=1100024 loops=1)
  Workers Planned: 2
  Workers Launched: 2
  -> Parallel Seq Scan on foo (cost=0.00..259371.02 rows=6804 width=2609) (actual time=9.244..887.380 rows=366675 loops=3)
       Filter: (NOT (bar IS DISTINCT FROM baz))
       Rows Removed by Filter: 721967
Planning time: 0.293 ms
Execution time: 1682.527 ms
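One diagnostic that might help narrow this down (a sketch only; its output is not part of the plans above) is to refresh the table statistics and then discourage sequential scans for a single transaction, to see whether the planner considers any of these indexes usable at all. `enable_seqscan` is a standard PostgreSQL planner setting; the rest reuses the table and query from this question.

```sql
-- Refresh planner statistics first, since the second plan estimates
-- rows=16330 while the query actually returns 1100024 rows.
ANALYZE foo;

BEGIN;
-- Affects only this transaction; the setting reverts at ROLLBACK.
SET LOCAL enable_seqscan = off;
-- If a Seq Scan still appears here, the planner does not consider
-- any existing index applicable to this predicate.
EXPLAIN ANALYZE SELECT * FROM foo WHERE bar IS NOT DISTINCT FROM baz;
ROLLBACK;
```

If an index scan shows up only with sequential scans disabled, the indexes are usable and the planner is simply judging the sequential scan to be cheaper for this row count.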
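I am running PostgreSQL 10. It appears that in PostgreSQL 12 I could use generated columns and put an index on the generated value. Unfortunately, I need a solution for PostgreSQL 10, where generated columns do not exist yet.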
Is there an index variation I have not tried yet that might be useful in my case?