I'm running PostgreSQL 9.4 on CentOS 6.7. One of the tables contains many millions of records; this is its DDL:
CREATE TABLE domain.examples (
id SERIAL,
sentence VARCHAR,
product_id BIGINT,
site_id INTEGER,
time_stamp BIGINT,
category_id INTEGER,
CONSTRAINT examples_pkey PRIMARY KEY(id)
)
WITH (oids = false);
CREATE INDEX examples_categories ON domain.examples
USING btree (category_id);
CREATE INDEX examples_site_idx ON domain.examples
USING btree (site_id);
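The application that consumes the data uses pagination, so we fetch batches of 1000 records. However, even when filtering on an indexed column, the fetch is very slow: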
explain analyze
select *
from domain.examples e
where e.category_id = 105154
order by id asc
limit 1000;
Limit (cost=0.57..331453.23 rows=1000 width=280) (actual time=2248261.276..2248296.600 rows=1000 loops=1)
-> Index Scan using examples_pkey on examples e (cost=0.57..486638470.34 rows=1468199 width=280) (actual time=2248261.269..2248293.705 rows=1000 loops=1)
Filter: (category_id = 105154)
Rows Removed by Filter: 173306740
Planning time: 70.821 ms
Execution time: 2248328.457 ms
What is causing the query to be so slow, and how can it be improved?
Thanks!
Answer 0 (score: 1)
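This is not the plan you want. PostgreSQL is walking the entire examples_pkey index and filtering for rows where category_id = 105154, which discards over 173 million rows before it finds 1000 matches. You can try running ANALYZE on the table to get better statistics, or tweak the planner GUCs (which I really don't recommend) to make the planner pick the right index.
Alternatively, if the number of rows with category_id = 105154 is not too high, I would suggest putting the filter in a CTE first, which forces the planner to use the examples_categories index: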
with favorite_category as (
select *
from domain.examples e
where e.category_id = 105154)
select *
from favorite_category
order by id asc
limit 1000;
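This fetches all the records with category_id = 105154 and sorts them in memory, provided the fetched data is smaller than work_mem. Run show work_mem; to see the current value; the default is 4MB.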
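The ANALYZE and work_mem suggestions above could be tried roughly as follows. This is only a sketch: the pg_stats lookup is one way to see what the planner believes about category_id, and the 64MB value is an arbitrary assumption for illustration, not a recommendation.

```sql
-- Refresh the planner statistics for the table.
ANALYZE domain.examples;

-- Inspect what the planner now knows about the category_id column.
SELECT n_distinct, most_common_vals, most_common_freqs, correlation
FROM pg_stats
WHERE schemaname = 'domain'
  AND tablename = 'examples'
  AND attname = 'category_id';

-- Check the memory available to a single sort.
SHOW work_mem;

-- Raise it for the current session only (64MB is an assumed example value).
SET work_mem = '64MB';
```

After this, re-running the explain analyze from the question shows whether the plan or the sort method changed.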
Answer 1 (score: 1)
You can create an index on both columns, category_id and id:
CREATE INDEX examples_site_idx2 ON domain.examples
USING btree (category_id, id);
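On a table this large, a plain CREATE INDEX blocks writes while it builds. A possible variant, assuming the table has to stay writable during the build, is to create the same index with CONCURRENTLY. The index name below is hypothetical, and the statement cannot run inside a transaction block.

```sql
-- Same column list as the index above, built without blocking writes.
-- examples_category_id_idx is a hypothetical name used only for this sketch.
CREATE INDEX CONCURRENTLY examples_category_id_idx
    ON domain.examples
    USING btree (category_id, id);

-- Refresh statistics afterwards so the planner has current estimates.
ANALYZE domain.examples;
```

The concurrent build takes longer than a normal one, and if it fails it leaves an invalid index behind that has to be dropped before retrying.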
I ran EXPLAIN ANALYZE on the query against a table with 3,000,000 rows.
With the old index:
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.43..9234.56 rows=1000 width=60) (actual time=0.655..597.193 rows=322 loops=1)
-> Index Scan using examples_pkey on examples e (cost=0.43..138512.43 rows=15000 width=60) (actual time=0.654..597.142 rows=322 loops=1)
Filter: (category_id = 105154)
Rows Removed by Filter: 2999678
Planning time: 2.295 ms
Execution time: 597.257 ms
(6 rows)
With the new index:
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.43..2585.13 rows=1000 width=60) (actual time=0.027..28.814 rows=322 loops=1)
-> Index Scan using examples_site_idx2 on examples e (cost=0.43..38770.93 rows=15000 width=60) (actual time=0.026..28.777 rows=322 loops=1)
Index Cond: (category_id = 105154)
Planning time: 1.471 ms
Execution time: 28.860 ms
(5 rows)
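Because the application reads in batches of 1000, later batches can continue from the last id already seen, so each batch starts at the right place in the (category_id, id) index. A minimal sketch, assuming the application remembers the largest id of the previous batch; the value 123456 is a hypothetical placeholder.

```sql
-- Next batch: resume after the last id returned by the previous batch.
-- 123456 is a hypothetical placeholder for that id.
SELECT *
FROM domain.examples e
WHERE e.category_id = 105154
  AND e.id > 123456
ORDER BY e.id ASC
LIMIT 1000;
```

Both conditions can be served by the two-column index, so the scan should not need to read and discard rows from other categories.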