PostgreSQL 9.4
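The difference between Index Only Scan and Index Scan is that in the first case we don't have to read the actual pages containing the rows that indexNode points to in order to fetch the result. We get it directly from the index node, and so avoid spending time and resources on additional random access (to the actual pages) and on checking row visibility.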
Now, I have two tables:
CREATE TABLE bar ( c1 integer, c2 text );
CREATE TABLE foo ( c1 integer, c2 text );
filled similarly with:
INSERT INTO foo
SELECT i, md5(random()::text)
FROM generate_series(1, 1000000) AS i;
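The index definitions are not shown in this post, yet the plans further down reference foo_c1_idx and bar_c1_idx. A minimal sketch of the rest of the setup, assuming plain B-tree indexes on c1 and assuming bar was loaded with the same INSERT as foo (both are guesses, not something stated in the post):

```sql
-- Assumption: bar is populated the same way as foo.
INSERT INTO bar
SELECT i, md5(random()::text)
FROM generate_series(1, 1000000) AS i;

-- Assumption: ordinary B-tree indexes on c1.
-- The index names are the ones that appear in the EXPLAIN output.
CREATE INDEX foo_c1_idx ON foo (c1);
CREATE INDEX bar_c1_idx ON bar (c1);
```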
Now, before working with the tables, I ran ANALYZE foo and ANALYZE bar, and then tried to execute:
EXPLAIN ANALYZE SELECT foo.c1
FROM foo foo
LEFT OUTER JOIN bar bar
ON foo.c1 = bar.c1
The planner gave me the following result:
Merge Left Join  (cost=0.63..83633.47 rows=1000000 width=4) (actual time=0.018..731.525 rows=1000000 loops=1)
  Merge Cond: (foo.c1 = bar.c1)
  ->  Index Only Scan using foo_c1_idx on foo  (cost=0.00..34317.36 rows=1000000 width=4) (actual time=0.009..216.308 rows=1000000 loops=1)
        Heap Fetches: 1000000
  ->  Index Only Scan using bar_c1_idx on bar  (cost=0.00..34317.36 rows=1000000 width=4) (actual time=0.006..233.899 rows=1000000 loops=1)
        Heap Fetches: 1000000
Note that the node is Index Only Scan. But if I try this:
EXPLAIN ANALYZE SELECT *
FROM foo foo
LEFT OUTER JOIN bar bar
ON foo.c1 = bar.c1
the plan becomes:
Merge Left Join  (cost=0.63..83633.47 rows=1000000 width=74) (actual time=0.014..738.559 rows=1000000 loops=1)
  Merge Cond: (foo.c1 = bar.c1)
  ->  Index Scan using foo_c1_idx on foo  (cost=0.00..34317.36 rows=1000000 width=37) (actual time=0.007..194.479 rows=1000000 loops=1)
  ->  Index Scan using bar_c1_idx on bar  (cost=0.00..34317.36 rows=1000000 width=37) (actual time=0.004..210.758 rows=1000000 loops=1)
Total runtime: 758.261 ms
Index Scan and Index Only Scan have the same cost, but in my case the query executed with Index Only Scan was 6 times faster than the last query.
Question: Why does the planner produce such strange costs? I ran ANALYZE to make sure I have up-to-date statistics on the tables.
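One check that may help narrow this down: both Index Only Scan nodes above report Heap Fetches: 1000000, which means every returned row still needed a heap visit to confirm visibility. The sketch below assumes the tables have not been vacuumed since they were loaded (the post does not say either way). It vacuums them so the visibility map is filled in, then looks at how many pages the planner considers all-visible before re-running the first query.

```sql
-- VACUUM marks all-visible pages in the visibility map;
-- ANALYZE alone does not do that.
VACUUM ANALYZE foo;
VACUUM ANALYZE bar;

-- relallvisible is the planner's count of all-visible pages;
-- compare it with relpages for each table.
SELECT relname, relpages, relallvisible
FROM pg_class
WHERE relname IN ('foo', 'bar');

-- Re-run the first query and compare the cost estimates
-- and the Heap Fetches counts with the plan shown earlier.
EXPLAIN ANALYZE SELECT foo.c1
FROM foo foo
LEFT OUTER JOIN bar bar
ON foo.c1 = bar.c1;
```

If relallvisible was 0 before the VACUUM, that would be consistent with the planner pricing the Index Only Scan the same as a plain Index Scan, though this is something to verify on the actual installation rather than a confirmed explanation.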