Question

I am trying to reproduce the code example from Markus Winand's site (Slow Indexes Part II).

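Test case:

Create a table without analyzing it, and introduce an index that works slower than a full table scan.
Analyze the query using that index.
Turn off index scans.
Analyze the query again.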

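I ran this test, and the database switches from an INDEX SCAN to a SEQ SCAN.

But looking at the actual numbers, the SEQ SCAN query is in fact slower than the INDEX SCAN query.

Why does the planner switch to the SEQ SCAN even though it is more expensive and slower?

Here are my results: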

drop table if exists employees;
CREATE TABLE employees (
   employee_id   bigint         NOT NULL,
   subsidiary_id bigint not null,
   first_name    text NOT NULL,
   last_name     text NOT NULL,
   date_of_birth DATE           NOT NULL,
   phone_number  text NOT NULL,
   CONSTRAINT employees_pk PRIMARY KEY (subsidiary_id, employee_id)
);

SELECT setseed(0.5);

insert into employees 
select a, 1, random()::text,random()::text, now(), '123123123'
from generate_series(1,200000) as t(a)
union all 
select a, 2, random()::text,random()::text, now(), '123123123'
from generate_series(1,500000) as t(a);

set enable_bitmapscan to false;
explain analyze 
select * 
from employees 
where subsidiary_id = 1 and first_name = '0.550025727134198';

"QUERY PLAN"
"Index Scan using employees_pk on employees  (cost=0.42..8596.82 rows=12 width=116) (actual time=0.024..38.409 rows=1 loops=1)"
"  Index Cond: (subsidiary_id = 1)"
"  Filter: (first_name = '0.550025727134198'::text)"
"  Rows Removed by Filter: 199999"
"Planning time: 0.114 ms"
"Execution time: 38.429 ms"

analyze employees;

explain analyze 
select * 
from employees 
where subsidiary_id = 1 and first_name = '0.550025727134198';

"QUERY PLAN"
"Seq Scan on employees  (cost=0.00..19142.00 rows=1 width=66) (actual time=0.017..66.579 rows=1 loops=1)"
"  Filter: ((subsidiary_id = 1) AND (first_name = '0.550025727134198'::text))"
"  Rows Removed by Filter: 699999"
"Planning time: 0.431 ms"
"Execution time: 66.601 ms"

set enable_seqscan to false;

explain analyze 
select * 
from employees 
where subsidiary_id = 1 and first_name = '0.550025727134198';

"QUERY PLAN"
"Index Scan using employees_pk on employees  (cost=0.42..23697.20 rows=1 width=66) (actual time=0.041..36.159 rows=1 loops=1)"
"  Index Cond: (subsidiary_id = 1)"
"  Filter: (first_name = '0.550025727134198'::text)"
"  Rows Removed by Filter: 199999"
"Planning time: 0.061 ms"
"Execution time: 36.178 ms"

Answer 1

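The planner does not take data locality into account in its calculations; it expects to read the data from disk every time. By default, index scans are therefore slightly penalized. If you have enough RAM for the file system cache, or your I/O is not saturated, you can find queries with a higher estimated cost that run faster. But the PostgreSQL planner is configured for disk access and a small cache, and that works well in practice. You probably ran the test on a server without any load. Try simulating a more realistic load on the server and repeat the test.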
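
Note that after `analyze employees` the planner estimates a cost of 19142.00 for the Seq Scan and 23697.20 for the Index Scan, so it picks the plan it believes is cheaper; the estimate just does not match the timings on a warm cache. As a rough sketch of how one might check this, assuming the table fits in memory on the test machine: `BUFFERS` shows whether pages came from shared buffers, and lowering `random_page_cost` tells the planner that random reads are cheap. The value 1.1 below is only an illustrative assumption, not a recommendation, and the planner may or may not switch plans with it.

```sql
-- Undo the earlier "set enable_seqscan to false" so the planner chooses freely.
RESET enable_seqscan;

-- BUFFERS reports "shared hit" (found in shared buffers) versus "read"
-- (had to be fetched from the OS cache or from disk).
EXPLAIN (ANALYZE, BUFFERS)
SELECT *
FROM employees
WHERE subsidiary_id = 1 AND first_name = '0.550025727134198';

-- Defaults: seq_page_cost = 1.0, random_page_cost = 4.0.
-- Assumption for illustration: data is mostly cached, so random page
-- access is priced close to sequential access.
SET random_page_cost = 1.1;

EXPLAIN ANALYZE
SELECT *
FROM employees
WHERE subsidiary_id = 1 AND first_name = '0.550025727134198';

-- Restore the default for the rest of the session.
RESET random_page_cost;
```

If the first plan reports only shared hits, the test never touched the disk, which is the situation the default cost settings do not model.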

Higher-cost plan runs faster
