https://wiki.postgresql.org/wiki/Introduction_to_VACUUM,_ANALYZE,_EXPLAIN,_and_COUNT
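I am currently reading this page to learn about PostgreSQL's EXPLAIN ANALYZE, and I am trying to understand the relationship between the estimated cost and the actual time.
The page gives a simple example: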
-> Nested Loop (cost=5.64..14.71 rows=1 width=140) (actual time=18.983..19.481 rows=4 loops=1)
-> Hash Join (cost=5.64..8.82 rows=1 width=72) (actual time=18.876..19.212 rows=4 loops=1)
-> Index Scan using pg_class_oid_index on pg_class i (cost=0.00..5.88 rows=1 width=72) (actual time=0.051..0.055 rows=1 loops=4)
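It says: "If you do the math, you'll see that 0.055 * 4 accounts for most of the difference between the total time of the hash join and the total time of the nested loop (the remainder is likely the overhead of measuring all of this)."
I'm not sure what "difference" refers to here; I can't find any difference that is close to 0.055 * 4. Am I being stupid and just overlooking something trivial?
By the way, I'm actually writing a lab report on databases, so in general, if I'm asked to write a short comment on estimated versus actual times based on some concrete results, what can I say?
Here is the plan I need to write up: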
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=39911.52..299300.41 rows=1 width=17) (actual time=4660.217..4952.328 rows=1 loops=1)
Join Filter: (casts.mid = movie.id)
Rows Removed by Join Filter: 2251735
-> Seq Scan on movie (cost=0.00..29721.64 rows=5542 width=21) (actual time=0.637..316.651 rows=4201 loops=1)
Filter: (year > 2010)
Rows Removed by Filter: 1533210
-> Materialize (cost=39911.52..269080.01 rows=6 width=4) (actual time=0.307..1.014 rows=536 loops=4201)
-> Hash Join (cost=39911.52..269079.98 rows=6 width=4) (actual time=1288.827..4089.872 rows=536 loops=1)
Hash Cond: (casts.pid = actor.id)
-> Seq Scan on casts (cost=0.00..186246.47 rows=11445847 width=8) (actual time=0.293..1487.138 rows=11445847 loops=1)
-> Hash (cost=39911.51..39911.51 rows=1 width=4) (actual time=414.130..414.130 rows=1 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 1kB
-> Seq Scan on actor (cost=0.00..39911.51 rows=1 width=4) (actual time=100.175..414.125 rows=1 loops=1)
Filter: (((fname)::text = 'Tom'::text) AND ((lname)::text = 'Hanks'::text))
Rows Removed by Filter: 1865033
Total runtime: 4952.822 ms
Answer 0 (score: 2)
Look at the actual times:
-> Nested Loop ........ (actual time=18.983..19.481 rows=4 loops=1)
..... .....
-> Hash Join ....... (actual time=18.876..19.212 rows=4 loops=1)
-> Index Scan ......... (actual time=0.051..0.055 rows=1 loops=4)
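4 (loops) * 0.055 = 0.22
19.212 + 0.22 = 19.432 ==> almost 19.481 (0.049 is unaccounted for)
In other words, the "difference" is 19.481 - 19.212 = 0.269, and the index scan's 0.22 covers most of it.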
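The same arithmetic can be checked mechanically. The following is only a small sketch: the helper name `total_ms` and the regular expression are my own, and it assumes the plan lines have the exact `actual time=a..b rows=n loops=k` shape shown in the question. The per-loop total time (the second number) is multiplied by `loops` to get the time a node contributed overall.

```python
import re

# The three plan lines quoted in the question (from the wiki example).
PLAN = """\
Nested Loop (cost=5.64..14.71 rows=1 width=140) (actual time=18.983..19.481 rows=4 loops=1)
Hash Join (cost=5.64..8.82 rows=1 width=72) (actual time=18.876..19.212 rows=4 loops=1)
Index Scan using pg_class_oid_index on pg_class i (cost=0.00..5.88 rows=1 width=72) (actual time=0.051..0.055 rows=1 loops=4)
"""

# Captures: startup time, per-loop total time, rows, loops.
ACTUAL = re.compile(r"actual time=(\d+\.\d+)\.\.(\d+\.\d+) rows=(\d+) loops=(\d+)")


def total_ms(line):
    """Time a node took across all loops: per-loop total time * loops."""
    match = ACTUAL.search(line)
    if match is None:
        raise ValueError("no 'actual time' section in: " + line)
    _startup, per_loop_total, _rows, loops = match.groups()
    return float(per_loop_total) * int(loops)


nested_loop, hash_join, index_scan = (total_ms(l) for l in PLAN.splitlines())

gap = nested_loop - hash_join    # the "difference" the wiki talks about
unexplained = gap - index_scan   # what is left over (measurement overhead)

print(f"index scan over all loops: {index_scan:.3f} ms")
print(f"nested loop - hash join:   {gap:.3f} ms")
print(f"left unexplained:          {unexplained:.3f} ms")
```

The key point is that the times printed for a node are per loop, so a node with loops=4 has to be multiplied by 4 before it is compared with its parent.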
Edit
I think that adding an index on actor(fname, lname), or even on just the single column actor(lname), could speed up this query considerably.
Look at this:
-> Seq Scan on actor (cost=0.00..39911.51 rows=1 width=4) (actual time=100.175..414.125 rows=1 loops=1)
Filter: (((fname)::text = 'Tom'::text) AND ((lname)::text = 'Hanks'::text))
Rows Removed by Filter: 1865033
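PostgreSQL performs a sequential scan on the actor table and filters out 1865033 rows to find just 1 row. The scan takes about 414 ms in total (actual time=100.175..414.125 is reported in milliseconds; the first number is the time until the first row is returned).
With an index, that one row could be found in a few milliseconds.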
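To illustrate the effect of such an index without a PostgreSQL server, here is a rough sketch that uses Python's built-in sqlite3 module as a stand-in. This is SQLite, not PostgreSQL, so the plan output looks different and the timings from the question do not carry over; the table contents and the index name `actor_lname_fname_idx` are made up for the example. In PostgreSQL the corresponding statement would be `CREATE INDEX ... ON actor (lname, fname)`.

```python
import sqlite3

# In-memory SQLite database standing in for the real PostgreSQL table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE actor (id INTEGER PRIMARY KEY, fname TEXT, lname TEXT)")
conn.executemany(
    "INSERT INTO actor (fname, lname) VALUES (?, ?)",
    [("Tom", "Hanks"), ("Meg", "Ryan"), ("Tom", "Cruise")],  # made-up sample rows
)

QUERY = "SELECT id FROM actor WHERE fname = 'Tom' AND lname = 'Hanks'"


def plan(sql):
    """Return the detail column (4th field) of SQLite's EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)


before = plan(QUERY)  # without an index: a full scan of the table
conn.execute("CREATE INDEX actor_lname_fname_idx ON actor (lname, fname)")
after = plan(QUERY)   # with the index: a search that uses the index

print("before:", before)
print("after: ", after)
```

The plan changes from a full table scan to an index search, which is the same kind of change you would hope to see in the PostgreSQL plan: the Seq Scan on actor replaced by an index scan.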