Question

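Why:

A colleague and I were discussing composite keys, in particular the fact that MySQL requires the index columns to be used in the WHERE clause in order, without gaps, for the planner to use the index. I wanted to show how Postgres uses the second column of a composite key for a scan, and I failed! It used the index for the first column, but not for the second. Totally confused, I played around and discovered that it starts using the second column when the index is 6.5 times smaller than the table: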

Populate:

create table so2 (a int not null,b int not null, c text, d int not null);
with l as (select generate_series(999,999+76,1) r) 
insert into so2
select l.r,l.r+1,concat('l',lpad('o',l.r,'o'),'ng'),1 from l;
alter table so2 ADD CONSTRAINT so2pk PRIMARY KEY (a,b);
analyze so2;

The plan that confused me:

t=# explain analyze select 42 from so2 where a=1004;
                                                   QUERY PLAN
----------------------------------------------------------------------------------------------------------------
 Index Only Scan using so2pk on so2  (cost=0.14..8.16 rows=1 width=0) (actual time=0.013..0.013 rows=1 loops=1)
   Index Cond: (a = 1004)
   Heap Fetches: 1
 Planning time: 0.090 ms
 Execution time: 0.026 ms
(5 rows)

t=# explain analyze select 42 from so2 where b=1004;
                                          QUERY PLAN
----------------------------------------------------------------------------------------------
 Seq Scan on so2  (cost=0.00..11.96 rows=1 width=0) (actual time=0.006..0.028 rows=1 loops=1)
   Filter: (b = 1004)
   Rows Removed by Filter: 76
 Planning time: 0.045 ms
 Execution time: 0.036 ms
(5 rows)

Then I dropped so2 and reran the populate step with 999+77 instead of 999+76, and the plan for column b changed:

t=# explain analyze select 42 from so2 where b=1004;
                                                   QUERY PLAN
-----------------------------------------------------------------------------------------------------------------
 Index Only Scan using so2pk on so2  (cost=0.14..12.74 rows=1 width=0) (actual time=0.004..0.004 rows=1 loops=1)
   Index Cond: (b = 1004)
   Heap Fetches: 1
 Planning time: 0.038 ms
 Execution time: 0.013 ms
(5 rows)

The only difference I noticed is the number of pages the relation needs:

Table size with the confusing plan:

t=# \dt+ so2
                  List of relations
 Schema | Name | Type  | Owner |  Size  | Description
--------+------+-------+-------+--------+-------------
 public | so2  | table | vao   | 120 kB |
(1 row)

Table size with the expected plan:

t=# \dt+ so2
                  List of relations
 Schema | Name | Type  | Owner |  Size  | Description
--------+------+-------+-------+--------+-------------
 public | so2  | table | vao   | 128 kB |
(1 row)

The index is the same size in both cases:

t=# \di+ so2pk
                      List of relations
 Schema | Name  | Type  | Owner | Table | Size  | Description
--------+-------+-------+-------+-------+-------+-------------
 public | so2pk | index | vao   | so2   | 16 kB |
(1 row)

The settings that could affect plans are all at their defaults:

select name,setting
from pg_settings
where source != 'default' and name in (
'enable_bitmapscan',
'enable_hashagg',
'enable_hashjoin',
'enable_indexscan',
'enable_indexonlyscan',
'enable_material',
'enable_mergejoin',
'enable_nestloop',
'enable_seqscan',
'enable_sort',
'enable_tidscan',
'seq_page_cost',
'random_page_cost',
'cpu_tuple_cost',
'cpu_index_tuple_cost',
'cpu_operator_cost',
'effective_cache_size',
'geqo',
'geqo_threshold',
'geqo_effort',
'geqo_pool_size',
'geqo_generations',
'geqo_selection_bias',
'geqo_seed',
'join_collapse_limit',
'from_collapse_limit',
'cursor_tuple_fraction',
'constraint_exclusion',
'default_statistics_target'
) order by name
;
 name | setting
------+---------
(0 rows)

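I tried several versions: 9.3.10 and 9.5.4 show the same behaviour.

Now, please forgive me for such a long post! The question:

16 kB is smaller than 120 kB, so why does the planner choose a Seq Scan?

Updated to reflect e4c5's affirmative remarks.

Also: for a second I thought it might be because the text column is kept in extended storage, so that the table itself (all columns but the text one) takes the same number of pages as the index. So I changed the storage to main and to plain, with no effect.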
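
A storage change of that kind can be sketched as follows. The exact statements used are not shown here, so their form is an assumption. Note that SET STORAGE only sets the strategy for rows written afterwards; rows already in the table keep their layout until they are rewritten.

```sql
-- Assumed form of the storage change (the original statements are not shown).
alter table so2 alter column c set storage main;   -- or: plain

-- Check the storage strategy now recorded for each column:
-- p = plain, m = main, x = extended, e = external.
select attname, attstorage
from pg_attribute
where attrelid = 'so2'::regclass and attnum > 0;
```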

Answer 1

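I think the key difference here is that MySQL can use only one index per table, whereas PostgreSQL has no such limitation. More than one index can be used, which is probably why they say:

"Multicolumn indexes should be used sparingly. In most situations, an index on a single column is sufficient and saves space and time. Indexes with more than three columns are unlikely to be helpful unless the usage of the table is extremely stylized."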

A few other pointers:

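1) You have too little data to draw any conclusions. Yes, query plans depend very much on size, that is, on the number of rows in the table. The first stored function created only 62 rows, and for that you do not need indexes.

2) You are searching for the value a = 4, which is not in the table.

3) The value of b is always 1, so an index on that column will be useless. Even a composite index will not gain a very high cardinality from it, i.e. the composite index on (a, b) is exactly the same as the index on [...].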
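
The size dependence in point 1 can be checked directly: the planner works from the page and row counts stored in pg_class, which ANALYZE refreshes. A minimal sketch, assuming the table and index names so2 and so2pk from the question:

```sql
-- relpages / reltuples are the planner's size estimates, not live counts.
select relname, relkind, relpages, reltuples
from pg_class
where relname in ('so2', 'so2pk');
```

Keep in mind that relpages counts only the main heap, whereas the size reported by \dt+ also includes the free space map, the visibility map and any TOAST data, so the two figures need not agree.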

Update

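There is no line-in-the-sand threshold, either in number of rows or in index size in KB, at which an index takes effect. The query planner decides whether to use an index, and which one, based on several configuration factors described at https://www.postgresql.org/docs/9.2/static/runtime-config-query.html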
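
As a rough illustration of those cost factors, the sequential-scan estimate in the question can be reproduced by hand. The sketch below assumes default cost settings (seq_page_cost = 1.0, cpu_tuple_cost = 0.01, cpu_operator_cost = 0.0025) and assumes the heap has 11 pages. relpages is not shown in the question; 11 is simply the value that matches the displayed cost.

```sql
-- Seq scan estimate: one seq_page_cost per page, plus per-row CPU cost
-- for fetching the tuple and evaluating the single filter (b = 1004).
select 11 * 1.0                    -- assumed relpages * seq_page_cost
     + 77 * (0.01 + 0.0025)        -- rows * (cpu_tuple_cost + cpu_operator_cost)
       as seq_scan_cost;           -- 11.9625, displayed as 11.96 in the plan

-- To see the estimate for the index plan the planner rejected,
-- discourage sequential scans for this session only and re-run EXPLAIN.
set enable_seqscan = off;
explain select 42 from so2 where b = 1004;
reset enable_seqscan;
```

With one more row the table appears to need one more page (its reported size grows by 8 kB). Under the same assumptions that would raise the sequential estimate to about 12.98, just above the 12.74 shown for the index-only scan, which would be consistent with the plan flipping at that point.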

When does Postgres start to check for a suitable index?
