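I have a table with roughly 10 million rows and an index on a date field. When I try to extract the unique values of the indexed field, Postgres runs a sequential scan even though the result set contains only 26 items. Why does the optimizer choose this plan? What can I do to avoid it?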
explain select "labelDate" from pages group by "labelDate";
QUERY PLAN
-----------------------------------------------------------------------
 HashAggregate  (cost=524616.78..524617.04 rows=26 width=4)
   Group Key: "labelDate"
   ->  Seq Scan on pages  (cost=0.00..499082.42 rows=10213742 width=4)
(3 rows)
Answer 0 (score: 1)
I think the problem is that the query planner wants to read the whole table to evaluate the GROUP BY clause, even though you do not use any aggregate function. In that respect it resembles the "Why is count(*) so slow?" issue, which comes up in many forms in PostgreSQL questions: the planner expects to visit every row rather than jump from one distinct value to the next in the index.
In your case, the query is also a bit unusual. What you are asking for is expressed more directly by this simple query:
SELECT DISTINCT "labelDate" FROM pages;
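If SELECT DISTINCT still ends up with a sequential scan, a workaround often suggested for PostgreSQL is to emulate a "loose index scan" with a recursive CTE, so that each step fetches only the next larger value from the index. The following is a minimal sketch rather than something tested against your table. It assumes the index on "labelDate" is an ordinary B-tree index and that NULL values do not need to appear in the result.

```sql
-- Sketch: walk the index one distinct value at a time.
-- Assumes a B-tree index on pages("labelDate"); NULLs are skipped.
WITH RECURSIVE t AS (
    -- Anchor: the smallest "labelDate" in the table.
    (SELECT "labelDate" FROM pages ORDER BY "labelDate" LIMIT 1)
    UNION ALL
    -- Step: the smallest value strictly greater than the previous one.
    -- The subquery yields NULL once no larger value exists,
    -- which stops the recursion on the next iteration.
    SELECT (SELECT "labelDate" FROM pages
            WHERE "labelDate" > t."labelDate"
            ORDER BY "labelDate" LIMIT 1)
    FROM t
    WHERE t."labelDate" IS NOT NULL
)
SELECT "labelDate" FROM t WHERE "labelDate" IS NOT NULL;
```

With only 26 distinct values this should need on the order of 26 short index lookups instead of a pass over 10 million rows. Run it under EXPLAIN ANALYZE to confirm that the planner actually uses the index for each step.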