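I am trying to reproduce the test case from Markus Winand's site (Slow Indexes, Part II).
I ran the test below, and the database switches from an INDEX SCAN to a SEQ SCAN.
Looking at the actual timings, though, the SEQ SCAN query is slower than the INDEX SCAN query.
Why does the planner switch to the SEQ SCAN, even though it is more expensive and slower?
Here are my results: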
drop table if exists employees;
CREATE TABLE employees (
employee_id bigint NOT NULL,
subsidiary_id bigint not null,
first_name text NOT NULL,
last_name text NOT NULL,
date_of_birth DATE NOT NULL,
phone_number text NOT NULL,
CONSTRAINT employees_pk PRIMARY KEY (subsidiary_id, employee_id)
);
SELECT setseed(0.5);
insert into employees
select a, 1, random()::text,random()::text, now(), '123123123'
from generate_series(1,200000) as t(a)
union all
select a, 2, random()::text,random()::text, now(), '123123123'
from generate_series(1,500000) as t(a);
set enable_bitmapscan to false;
explain analyze
select *
from employees
where subsidiary_id = 1 and first_name = '0.550025727134198';
"QUERY PLAN"
"Index Scan using employees_pk on employees (cost=0.42..8596.82 rows=12 width=116) (actual time=0.024..38.409 rows=1 loops=1)"
" Index Cond: (subsidiary_id = 1)"
" Filter: (first_name = '0.550025727134198'::text)"
" Rows Removed by Filter: 199999"
"Planning time: 0.114 ms"
"Execution time: 38.429 ms"
analyze employees;
explain analyze
select *
from employees
where subsidiary_id = 1 and first_name = '0.550025727134198';
"QUERY PLAN"
"Seq Scan on employees (cost=0.00..19142.00 rows=1 width=66) (actual time=0.017..66.579 rows=1 loops=1)"
" Filter: ((subsidiary_id = 1) AND (first_name = '0.550025727134198'::text))"
" Rows Removed by Filter: 699999"
"Planning time: 0.431 ms"
"Execution time: 66.601 ms"
set enable_seqscan to false;
explain analyze
select *
from employees
where subsidiary_id = 1 and first_name = '0.550025727134198';
"QUERY PLAN"
"Index Scan using employees_pk on employees (cost=0.42..23697.20 rows=1 width=66) (actual time=0.041..36.159 rows=1 loops=1)"
" Index Cond: (subsidiary_id = 1)"
" Filter: (first_name = '0.550025727134198'::text)"
" Rows Removed by Filter: 199999"
"Planning time: 0.061 ms"
"Execution time: 36.178 ms"
Answer 0 (score: 0)
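The planner does not take into account where the data currently lives (cache or disk); it assumes the data has to be read from disk every time. By default, index access is somewhat penalized: random page reads are costed higher than sequential ones (random_page_cost defaults to 4.0, seq_page_cost to 1.0). If you have enough RAM for the filesystem cache, or your I/O is not saturated, you can find queries with a higher estimated cost that run faster. The PostgreSQL planner's defaults are tuned for disk-based storage and a small cache, and under those conditions its choice is sound. You probably ran the test on a server with no other load. Try simulating a more realistic load on the server and repeat the test.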
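If the data really is mostly cached, one way to check the explanation above is to tell the planner so for the current session and re-run the query. The following is a minimal sketch, not a tuning recommendation: the values 1.1 and '4GB' are illustrative assumptions, and whether the plan changes depends on your hardware and PostgreSQL version.

```sql
-- Undo the override from the last test so the planner chooses freely again.
RESET enable_seqscan;

-- Inspect the current cost settings (defaults: 4 and 1).
SHOW random_page_cost;
SHOW seq_page_cost;

-- Assumption: most of the table and index fit in cache, so a random page
-- read costs little more than a sequential one. 1.1 is an illustrative value.
SET random_page_cost = 1.1;

-- Assumption: roughly 4GB of memory is available for caching data.
-- This affects cost estimates only; it does not allocate any memory.
SET effective_cache_size = '4GB';

-- Re-run the query and compare the chosen plan and its estimated cost.
EXPLAIN ANALYZE
SELECT *
FROM employees
WHERE subsidiary_id = 1 AND first_name = '0.550025727134198';
```

These SET commands only last for the session, so they are safe for experimenting. If the planner then prefers the index scan, that suggests the default cost settings, rather than the statistics, were behind the switch to the sequential scan.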