Question

I have a very simple SQL query:

select * from email.email_task where acquire_time < now() and state IN ('CREATED', 'RELEASED') order by creation_time asc limit 1;

I created 2 indexes:

1. An index on state
2. An index on state, acquire_time, creation_time
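
For reference, the two indexes described above might have been created roughly as follows. This is only a sketch: the post does not show the actual DDL, so the names index1 and index2 (borrowed from the labels used later in the question) and the default btree index type are assumptions.

```sql
-- Hypothetical DDL; the post does not show the real statements.

-- index1: single-column index on state
CREATE INDEX index1 ON email.email_task (state);

-- index2: composite index, columns in the order listed in the post
CREATE INDEX index2 ON email.email_task (state, acquire_time, creation_time);
```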

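Ideally, I think Postgres should choose the second one, because it covers every column this query needs.

But the execution plan shows otherwise; it does not use any index: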

Limit  (cost=187404.36..187404.36 rows=1 width=743)
   ->  Sort  (cost=187404.36..190753.58 rows=1339690 width=743)
         Sort Key: creation_time
         ->  Seq Scan on email_task  (cost=0.00..180705.91 rows=1339690 width=743)
               Filter: (((state)::text = 'CREATED'::text) AND (acquire_time < now()))

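As far as I know, if the number of rows returned reaches about 10% of the total, Postgres prefers a Seq Scan over an Index Scan (as described in "Why does PostgreSQL perform sequential scan on indexed column?"). That would explain why index1 was not chosen.

What I don't understand is why index2 was not chosen, even though it matches all the columns.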
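
One way to see what the planner thinks an index-based plan would cost is to discourage sequential scans for a single transaction and compare the estimates. The following is a minimal sketch under the assumption that the indexes from the question exist. `enable_seqscan` is a standard planner setting, and `SET LOCAL` limits the change to the enclosing transaction.

```sql
BEGIN;

-- Discourage (not forbid) sequential scans for this transaction only.
SET LOCAL enable_seqscan = off;

-- Plain EXPLAIN shows estimated costs without running the query.
EXPLAIN
SELECT *
FROM email.email_task
WHERE acquire_time < now()
  AND state IN ('CREATED', 'RELEASED')
ORDER BY creation_time ASC
LIMIT 1;

ROLLBACK;
```

If the plan shown here uses one of the indexes but has a higher total cost than the Seq Scan plan above, the planner skipped that index because of its cost estimate, not because the index was unusable.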

Then I tried a third index:

creation_time, acquire_time, state
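
As a sketch, the third index could have been created like this. The name perf_1 is the one that appears in the plan below. The exact statement is not shown in the post, and the first column is assumed to be the same creation_time column used in the query's ORDER BY.

```sql
-- Hypothetical DDL for the third index; the name perf_1 comes from the plan output.
-- The leading column matches the ORDER BY column of the query.
CREATE INDEX perf_1 ON email.email_task (creation_time, acquire_time, state);
```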

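This time it uses index3 (I added the index, named perf_1, on another, smaller database, because the original table has 2 million rows and building it there takes too long):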

Limit  (cost=0.29..0.36 rows=1 width=75) (actual time=0.043..0.043 rows=1 loops=1)
   ->  Index Scan using perf_1 on email_task  (cost=0.29..763.76 rows=9998 width=75) (actual time=0.042..0.042 rows=1 loops=1)
         Index Cond: (acquire_time < now())
         Filter: ((state)::text = ANY ('{CREATED,RELEASED}'::text[]))

So it seems the Postgres planner considers the ORDER BY clause first and the WHERE clause second, which is a bit counter-intuitive.
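
One way to check this impression is to compare the actual plans for the query with and without the ORDER BY clause. The sketch below assumes the same table and indexes as above. Note that `EXPLAIN (ANALYZE, BUFFERS)` really executes the statement, so it may be slow on the 2-million-row table.

```sql
-- Plan and timing for the original query (ORDER BY + LIMIT).
EXPLAIN (ANALYZE, BUFFERS)
SELECT *
FROM email.email_task
WHERE acquire_time < now()
  AND state IN ('CREATED', 'RELEASED')
ORDER BY creation_time ASC
LIMIT 1;

-- Same filter without ORDER BY, to see how the sort requirement changes the plan.
EXPLAIN (ANALYZE, BUFFERS)
SELECT *
FROM email.email_task
WHERE acquire_time < now()
  AND state IN ('CREATED', 'RELEASED')
LIMIT 1;
```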

Is my understanding correct, or are there other factors that influence the Postgres planner?

Thanks in advance.

0 answers: