Question

Consider a simple example:

=> create table t1 ( a int, b int, c int );
CREATE TABLE

=> insert into t1 select a, a, a from generate_series(1,100) a;
INSERT 0 100

=> create index i1 on t1(b);
CREATE INDEX

=> vacuum t1;
VACUUM

=> explain analyze select b from t1 where b = 10;
                                         QUERY PLAN
--------------------------------------------------------------------------------------------
 Seq Scan on t1  (cost=0.00..2.25 rows=1 width=4) (actual time=0.016..0.035 rows=1 loops=1)
   Filter: (b = 10)
   Rows Removed by Filter: 99
 Planning Time: 0.082 ms
 Execution Time: 0.051 ms
(5 rows)

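As you can see, I select only b and filter only on b. I also ran vacuum t1; manually so that the table's visibility information (the visibility map that index-only scans rely on) is up to date.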

So why does PostgreSQL still choose a Seq Scan rather than an Index Only Scan? Thanks a lot.

Edit:

After adding more rows, it does perform an Index Only Scan:

=> insert into t1 select a, a, a from generate_series(1,2000) a;

=> vacuum t1;

=> explain analyze select b from t1 where b = 10;
                                                 QUERY PLAN
-------------------------------------------------------------------------------------------------------------
 Index Only Scan using i1 on t1  (cost=0.28..4.45 rows=10 width=4) (actual time=0.038..0.039 rows=1 loops=1)
   Index Cond: (b = 10)
   Heap Fetches: 0
 Planning Time: 0.186 ms
 Execution Time: 0.058 ms
(5 rows)

It seems that PostgreSQL does not favor an Index Only Scan when the number of rows is small.
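
One way to probe this impression is to discourage sequential scans for the current session and compare the planner's estimated costs. This is only a minimal sketch, assuming the 100-row t1 and the index i1 from the first example and otherwise default planner settings. enable_seqscan is a standard planner setting; it penalizes sequential scans rather than forbidding them.

```sql
-- Discourage sequential scans for this session only.
set enable_seqscan = off;

-- The planner should now pick an index-based plan; compare its
-- estimated cost with the 2.25 of the Seq Scan shown above.
explain select b from t1 where b = 10;

-- Restore the default behaviour.
reset enable_seqscan;
```

If the index-based plan reports a higher estimated cost than the Seq Scan, the original choice was simply the cheaper plan by the planner's own estimate.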

Answer 1

Since nobody has posted a detailed explanation, I will write a short answer here.

From @a_horse_with_no_name:

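100 rows will fit in a single data block, so a seq scan requires only a single I/O operation, and the index-only scan would require the same. Use explain (analyze, buffers) to see more detail on the blocks (= buffers) the query needs.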
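
The suggestion in that comment can be tried directly. A minimal sketch, assuming the table t1 from the question; the numbers reported depend on the table's current contents and on what is already cached, so none are shown here.

```sql
-- Run the query and report the buffers (blocks, 8 kB by default)
-- touched by each plan node.
explain (analyze, buffers) select b from t1 where b = 10;
```

In the output, look for the "Buffers: shared hit=... read=..." line under the scan node: "hit" counts blocks found in shared buffers, "read" counts blocks that had to be fetched from outside them.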

From https://www.postgresql.org/docs/current/indexes-examine.html:

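It is especially fatal to use very small test data sets. While selecting 1000 out of 100000 rows could be a candidate for an index, selecting 1 out of 100 rows will hardly be, because the 100 rows probably fit within a single disk page, and there is no plan that can beat sequentially fetching 1 disk page.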
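
To check the "single disk page" point against the table from the question, the planner's own statistics can be read from pg_class. This is a rough sketch: it assumes the names t1 and i1 are unique across schemas, and it uses a simplified version of the sequential-scan cost estimate for a filter with a single operator, not the planner's full calculation.

```sql
-- relpages and reltuples are the planner's size estimates,
-- refreshed by vacuum and analyze.
select relname, relpages, reltuples
from pg_class
where relname in ('t1', 'i1');

-- Simplified Seq Scan estimate for a one-operator filter:
-- pages * seq_page_cost + rows * (cpu_tuple_cost + cpu_operator_cost)
select relpages * current_setting('seq_page_cost')::float8
     + reltuples * (current_setting('cpu_tuple_cost')::float8
                  + current_setting('cpu_operator_cost')::float8)
       as est_seq_scan_cost
from pg_class
where relname = 't1';
```

With one page, 100 rows, and the default settings (seq_page_cost = 1.0, cpu_tuple_cost = 0.01, cpu_operator_cost = 0.0025), this works out to 1.0 + 100 × 0.0125 = 2.25, which matches the Seq Scan cost in the question's first plan.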

Why does PostgreSQL do a Seq Scan instead of an Index Only Scan when everything is in the index?