Higher-cost plan runs faster

Time: 2016-07-14 09:14:28

Tags: sql postgresql postgresql-performance

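I am trying to reproduce a code example from Markus Winand's site (Slow Indexes, Part II).

Test case:

  1. Create an unanalyzed table and introduce an index that works slower than a full table scan.
  2. Analyze the query that uses this index.
  3. Turn off index scans.
  4. Analyze the query once more.

I ran this test, and the database switches from an Index Scan to a Seq Scan.

But looking at the actual numbers, the Seq Scan query is actually slower than the Index Scan query.

Why does the planner switch to the Seq Scan, even though it is more expensive and slower?

Here are my results: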

    drop table if exists employees;
    CREATE TABLE employees (
       employee_id   bigint         NOT NULL,
       subsidiary_id bigint not null,
       first_name    text NOT NULL,
       last_name     text NOT NULL,
       date_of_birth DATE           NOT NULL,
       phone_number  text NOT NULL,
       CONSTRAINT employees_pk PRIMARY KEY (subsidiary_id, employee_id)
    );
    
    SELECT setseed(0.5);
    
    insert into employees 
    select a, 1, random()::text,random()::text, now(), '123123123'
    from generate_series(1,200000) as t(a)
    union all 
    select a, 2, random()::text,random()::text, now(), '123123123'
    from generate_series(1,500000) as t(a);
    
    set enable_bitmapscan to false;
    explain analyze 
    select * 
    from employees 
    where subsidiary_id = 1 and first_name = '0.550025727134198';
    
    "QUERY PLAN"
    "Index Scan using employees_pk on employees  (cost=0.42..8596.82 rows=12 width=116) (actual time=0.024..38.409 rows=1 loops=1)"
    "  Index Cond: (subsidiary_id = 1)"
    "  Filter: (first_name = '0.550025727134198'::text)"
    "  Rows Removed by Filter: 199999"
    "Planning time: 0.114 ms"
    "Execution time: 38.429 ms"
    
    analyze employees;
    
    explain analyze 
    select * 
    from employees 
    where subsidiary_id = 1 and first_name = '0.550025727134198';
    
    "QUERY PLAN"
    "Seq Scan on employees  (cost=0.00..19142.00 rows=1 width=66) (actual time=0.017..66.579 rows=1 loops=1)"
    "  Filter: ((subsidiary_id = 1) AND (first_name = '0.550025727134198'::text))"
    "  Rows Removed by Filter: 699999"
    "Planning time: 0.431 ms"
    "Execution time: 66.601 ms"
    
    set enable_seqscan to false;
    
    explain analyze 
    select * 
    from employees 
    where subsidiary_id = 1 and first_name = '0.550025727134198';
    
    "QUERY PLAN"
    "Index Scan using employees_pk on employees  (cost=0.42..23697.20 rows=1 width=66) (actual time=0.041..36.159 rows=1 loops=1)"
    "  Index Cond: (subsidiary_id = 1)"
    "  Filter: (first_name = '0.550025727134198'::text)"
    "  Rows Removed by Filter: 199999"
    "Planning time: 0.061 ms"
    "Execution time: 36.178 ms"
    

1 answer:

Answer 0: (score: 0)

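The planner does not take data locality into account: it assumes that data has to be read from disk every time. Indexes are penalized slightly by default, so if you have enough RAM for the file system cache, or your I/O is not saturated, you can find queries with a higher estimated cost that run faster. The PostgreSQL planner is configured by default for disk access and a small cache, and that works really well. You probably ran the test on a server without load. Try to simulate a more realistic load on the server and repeat the test.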
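
One way to check this in the session from the question is to inspect the planner's page cost settings and, for the current session only, adjust them to describe a mostly cached table. This is a minimal sketch: `seq_page_cost`, `random_page_cost`, and `effective_cache_size` are standard PostgreSQL settings, but the values `1.1` and `'4GB'` are illustrative assumptions rather than recommendations, and whether the chosen plan changes depends on your data and hardware.

```sql
-- Show the cost parameters the planner is currently using.
SHOW seq_page_cost;
SHOW random_page_cost;
SHOW effective_cache_size;

-- Undo the override from the question so the planner chooses freely again.
RESET enable_seqscan;

-- Assumption: the table is mostly cached, so a random page read costs
-- about the same as a sequential one. These values are illustrative only.
SET random_page_cost = 1.1;
SET effective_cache_size = '4GB';

-- Re-run the query from the question and compare the estimated cost
-- with the actual execution time.
EXPLAIN ANALYZE
SELECT *
FROM employees
WHERE subsidiary_id = 1 AND first_name = '0.550025727134198';

-- Restore the configured defaults for this session.
RESET random_page_cost;
RESET effective_cache_size;
```

`SET` changes apply only to the current session, so this can be tried without editing `postgresql.conf`.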