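I want to query for a list of values or NULL, but without using OR. The reason for avoiding OR is that I need the index on that field to be used to speed up the query.
A simple example to illustrate my problem: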
CREATE TABLE fruits
(
name text,
quantity integer
);
(The real table has many additional integer columns.)
The query I am not happy with is
SELECT * FROM fruits WHERE quantity IN (1,2,3,4) OR quantity IS NULL;
The query I am hoping for would look something like
SELECT * FROM fruits WHERE quantity MAGIC (1,2,3,4,NULL);
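I am using PostgreSQL 9.1.
As far as I can tell from the documentation (e.g. http://www.postgresql.org/docs/9.1/static/functions-comparisons.html) and from testing, this cannot be done. But I am hoping one of you has some magical insight.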
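For context, simply adding NULL to the IN list does not help: IN is defined in terms of `=`, and a comparison with NULL yields NULL rather than true. A minimal sketch that should run in psql as-is (the column aliases are only for illustration):

```sql
-- IN behaves like a chain of "=" comparisons joined by OR,
-- and "x = NULL" evaluates to NULL, never to true.
SELECT 1             IN (1, 2, 3, 4, NULL) AS one_in_list,   -- true: matches the element 1
       5             IN (1, 2, 3, 4, NULL) AS five_in_list,  -- NULL: no match, and the NULL element leaves the result unknown
       NULL::integer IN (1, 2, 3, 4, NULL) AS null_in_list;  -- NULL: a WHERE clause would reject this row
```

Because WHERE only keeps rows for which the condition is true, rows with a NULL quantity are never returned by such a list.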
Answer 0 (score: 1)
An ugly hack using COALESCE:
SELECT *
FROM fruits
WHERE COALESCE(quantity,1) IN (1,2,3,4)
;
Please check the generated plan. IIRC, the optimizer knows about COALESCE() in this case.
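To see why the hack returns the NULL rows, here is a small sketch on a scratch table. The table name `fruits_demo` and the three sample rows are hypothetical and exist only for this illustration:

```sql
-- Hypothetical scratch table and rows, only for illustration.
CREATE TEMP TABLE fruits_demo (name text, quantity integer);
INSERT INTO fruits_demo (name, quantity)
VALUES ('apple', 1), ('pear', NULL), ('plum', 7);

-- COALESCE maps NULL to 1, and 1 is a member of the list,
-- so the row with a NULL quantity passes the filter as well.
SELECT name, quantity
FROM fruits_demo
WHERE COALESCE(quantity, 1) IN (1, 2, 3, 4);
-- matches 'apple' and 'pear'; 'plum' is filtered out
```

The trick only works if the replacement value is itself in the IN list (or is added to it).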
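Update: an alternative is the EXISTS(NOT EXISTS(NOT IN)) trick, which generates a different plan here. Note that it relies on an id key column, which the example table in the question does not define.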
-- EXPLAIN ANALYZE
SELECT *
FROM fruits fr
WHERE EXISTS (
SELECT * FROM fruits ex
WHERE ex.id = fr.id
AND NOT EXISTS (
SELECT * FROM fruits nx
WHERE nx.id = ex.id
AND nx.quantity NOT IN (1,2,3,4)
)
)
;
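BTW: in my tests (up to 1 million rows, with only 4 plus a few qualifying), the first query (which does not use the index) was always faster than the second (which uses the index and a hash anti join). YMMV.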
Update 2: the original IS NULL OR IN() query is the clear winner here:
-- EXPLAIN ANALYZE
SELECT *
FROM fruits
WHERE quantity IS NULL
OR quantity IN (1,2,3,4)
;
Answer 1 (score: 1)
A test table with 10k rows:
create table fruits (name text, quantity integer);
insert into fruits (name, quantity)
select left(md5(i::text), 6), i
from generate_series(1, 10000) s(i);
With a plain index on quantity:
create index fruits_index on fruits(quantity);
analyze fruits;
The query with or:
explain analyze
SELECT * FROM fruits WHERE quantity IN (1,2,3,4) OR quantity IS NULL;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on fruits (cost=21.29..34.12 rows=4 width=11) (actual time=0.032..0.032 rows=4 loops=1)
Recheck Cond: ((quantity = ANY ('{1,2,3,4}'::integer[])) OR (quantity IS NULL))
-> BitmapOr (cost=21.29..21.29 rows=4 width=0) (actual time=0.025..0.025 rows=0 loops=1)
-> Bitmap Index Scan on fruits_index (cost=0.00..17.03 rows=4 width=0) (actual time=0.019..0.019 rows=4 loops=1)
Index Cond: (quantity = ANY ('{1,2,3,4}'::integer[]))
-> Bitmap Index Scan on fruits_index (cost=0.00..4.26 rows=1 width=0) (actual time=0.004..0.004 rows=0 loops=1)
Index Cond: (quantity IS NULL)
Total runtime: 0.089 ms
Without the or:
explain analyze
SELECT * FROM fruits WHERE quantity IN (1,2,3,4);
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
Index Scan using fruits_index on fruits (cost=0.00..21.07 rows=4 width=11) (actual time=0.026..0.038 rows=4 loops=1)
Index Cond: (quantity = ANY ('{1,2,3,4}'::integer[]))
Total runtime: 0.085 ms
The coalesce version suggested by wildplasser results in a sequential scan:
explain analyze
SELECT *
FROM fruits
WHERE COALESCE(quantity, -1) IN (-1,1,2,3,4);
QUERY PLAN
-----------------------------------------------------------------------------------------------------
Seq Scan on fruits (cost=0.00..217.50 rows=250 width=11) (actual time=0.023..4.358 rows=4 loops=1)
Filter: (COALESCE(quantity, (-1)) = ANY ('{-1,1,2,3,4}'::integer[]))
Rows Removed by Filter: 9996
Total runtime: 4.395 ms
Unless an expression index on the coalesce is created:
create index fruits_coalesce_index on fruits(coalesce(quantity, -1));
analyze fruits;
explain analyze
SELECT *
FROM fruits
WHERE COALESCE(quantity, -1) IN (-1,1,2,3,4);
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------
Index Scan using fruits_coalesce_index on fruits (cost=0.00..25.34 rows=5 width=11) (actual time=0.112..0.124 rows=4 loops=1)
Index Cond: (COALESCE(quantity, (-1)) = ANY ('{-1,1,2,3,4}'::integer[]))
Total runtime: 0.172 ms
But it is still worse than the plain or query backed by a plain index on quantity.
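One caveat about the test data: quantity is filled from generate_series, so the table contains no NULLs, which is why the IS NULL bitmap index scan in the first plan reports rows=0. To repeat the comparison with some NULLs present, something along these lines could be used. The figure of 100 extra rows is an arbitrary assumption, not part of the original test:

```sql
-- Assumption: 100 NULL rows is an arbitrary choice for illustration.
INSERT INTO fruits (name, quantity)
SELECT left(md5(i::text), 6), NULL::integer
FROM generate_series(10001, 10100) s(i);

-- Refresh statistics so the planner sees the new NULL fraction.
ANALYZE fruits;

EXPLAIN ANALYZE
SELECT * FROM fruits WHERE quantity IN (1,2,3,4) OR quantity IS NULL;
```

The resulting plans and timings depend on the data distribution and the PostgreSQL version, so they should be compared on your own data.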
Answer 2 (score: 0)
This is not an answer to your exact question, but you could build a partial index for your query:
CREATE INDEX idx_partial ON fruits (quantity)
WHERE quantity IN (1,2,3,4) OR quantity IS NULL;
From the docs: http://www.postgresql.org/docs/current/interactive/indexes-partial.html
Your query should then use this index, which should speed it up.
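Whether the planner actually picks the partial index can be checked with EXPLAIN. A short sketch, assuming the fruits table from the question and a partial index named idx_partial already exist:

```sql
-- Assumes the fruits table and the partial index idx_partial already exist.
ANALYZE fruits;

-- The WHERE clause repeats the index predicate exactly, so the planner
-- can recognise that the partial index covers this query.
EXPLAIN
SELECT * FROM fruits
WHERE quantity IN (1,2,3,4) OR quantity IS NULL;
```

If idx_partial does not appear in the plan, the planner either could not match the query condition to the index predicate or estimated another access path to be cheaper.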