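There are a few questions out there saying that Postgres does not use an index for ORDER BY, but my case is the opposite: it uses one when it should not.
Sort without an index, hot run after the results were cached. Takes 8.48 seconds: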
explain (analyze,buffers) select * from users order by userid limit 100000;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------
Limit (cost=246372.98..246622.98 rows=100000 width=72) (actual time=8451.119..8479.138 rows=100000 loops=1)
  Buffers: shared hit=16134 read=35121
  -> Sort (cost=246372.98..251348.03 rows=1990021 width=72) (actual time=8451.117..8467.403 rows=100000 loops=1)
        Sort Key: userid
        Sort Method: top-N heapsort Memory: 20207kB
        Buffers: shared hit=16134 read=35121
        -> Seq Scan on users (cost=0.00..71155.21 rows=1990021 width=72) (actual time=25.448..7782.830 rows=1995958 loops=1)
              Buffers: shared hit=16134 read=35121
Planning time: 40.542 ms
Execution time: 8487.556 ms
(10 rows)
Sort using the index on the userid column. Uses more disk I/O and takes up to 6.2 minutes:
explain (analyze,buffers) select * from users order by userid limit 100000;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.43..12771.83 rows=100000 width=72) (actual time=35.498..372437.748 rows=100000 loops=1)
  Buffers: shared hit=60846 read=39425
  -> Index Scan using users_userid_idx on users (cost=0.43..255288.96 rows=1998907 width=72) (actual time=35.496..372372.192 rows=100000 loops=1)
        Buffers: shared hit=60846 read=39425
Planning time: 0.160 ms
Execution time: 372476.536 ms
(6 rows)
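A few things to note:
My question is not about improving the ORDER BY, but about understanding why the planner's estimate is so wrong. At the time of writing, I ran these queries on Postgres 9.4 on my Mac running OS X. I do not have another machine with a different operating system to test on right now, but I will probably do so soon.
Can others confirm whether this is the planner's mistake, or whether something is wrong with my machine?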
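To make the mismatch concrete, the planner's figures can be reproduced by hand. As far as I understand it, the cost on a Limit node is interpolated linearly between the child's startup cost and its total cost, in proportion to the fraction of the estimated rows that will be fetched. The sketch below is plain arithmetic on the numbers copied from the two plans above; `limit_cost` is a hypothetical helper written for illustration, not a PostgreSQL function.

```python
def limit_cost(startup: float, total: float, est_rows: int, limit: int) -> float:
    """Cost of fetching the first `limit` rows, assuming cost grows
    linearly with the number of rows returned by the child node."""
    fraction = min(limit / est_rows, 1.0)
    return startup + (total - startup) * fraction


# Index Scan node of the slow plan: cost=0.43..255288.96 rows=1998907
index_limit = limit_cost(0.43, 255288.96, 1998907, 100000)

# Total Limit cost of the top-N sort plan, and both measured execution times
sort_limit_cost = 246622.98
index_actual_ms = 372476.536
sort_actual_ms = 8487.556

print(f"interpolated Limit cost over the index scan: {index_limit:.2f}")
print(f"estimated: index plan is {sort_limit_cost / index_limit:.1f}x cheaper")
print(f"measured:  index plan is {index_actual_ms / sort_actual_ms:.1f}x slower")
```

The interpolated value lands within about 0.01 of the 12771.83 printed in the plan, so the Limit cost itself looks internally consistent. The planner rates the index plan as roughly 19 times cheaper, while on this run it was roughly 44 times slower, and that gap is what I am asking about.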
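Answer 0 (score: 1)
I am honestly dismayed by what actually happened. After I made a few changes, including a larger shared buffers setting, these are the new statistics.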
explain (analyze,buffers) select * from users order by userid limit 100000;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.43..12788.49 rows=100000 width=72) (actual time=0.031..78.785 rows=100000 loops=1)
  Buffers: shared hit=100271
  -> Index Scan using users_userid_idx on users (cost=0.43..255244.73 rows=1995958 width=72) (actual time=0.030..65.937 rows=100000 loops=1)
        Buffers: shared hit=100271
Planning time: 0.119 ms
Execution time: 84.985 ms
(6 rows)
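The only change is that there is no disk I/O, because everything is cached, probably thanks to the increased shared buffers. But the change in actual time defies logic.
The plain top-N heapsort without the index also improved.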
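One rough way to see where the time went in the original slow runs is to divide each scan node's elapsed time by its `read` count. This is only a sketch: it charges the whole node time to block reads, so it gives an upper bound on the average cost per read, and `ms_per_read` is a hypothetical helper, not something Postgres reports.

```python
def ms_per_read(node_ms: float, blocks_read: int) -> float:
    """Upper bound on the average time per block read, obtained by
    charging the entire node time to reads."""
    return node_ms / blocks_read


# (actual end time in ms, shared read count) from the plans in the question
slow_index_scan = ms_per_read(372372.192, 39425)
slow_seq_scan = ms_per_read(7782.830, 35121)

print(f"index scan: {slow_index_scan:.2f} ms per block read")
print(f"seq scan:   {slow_seq_scan:.3f} ms per block read")
print(f"ratio:      {slow_index_scan / slow_seq_scan:.0f}x")
```

Both plans read a similar number of blocks, yet each read in the index scan cost on the order of 9 ms against a fraction of a millisecond for the sequential scan. That pattern would be consistent with slow random I/O on that run, though I cannot prove that this was the cause.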
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------
Limit (cost=246955.09..247205.09 rows=100000 width=72) (actual time=707.350..734.954 rows=100000 loops=1)
  Buffers: shared hit=26071 read=25184
  -> Sort (cost=246955.09..251944.99 rows=1995958 width=72) (actual time=707.348..723.127 rows=100000 loops=1)
        Sort Key: userid
        Sort Method: top-N heapsort Memory: 20207kB
        Buffers: shared hit=26071 read=25184
        -> Seq Scan on users (cost=0.00..71214.58 rows=1995958 width=72) (actual time=9.922..270.684 rows=1995958 loops=1)
              Buffers: shared hit=26071 read=25184
Planning time: 0.090 ms
Execution time: 743.788 ms
(10 rows)
After changing shared buffers back to 128 MB, the results are still good.
explain (analyze,buffers) select * from users order by userid limit 100000;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.43..12788.49 rows=100000 width=72) (actual time=0.098..232.314 rows=100000 loops=1)
  Buffers: shared hit=61313 read=38958
  -> Index Scan using users_userid_idx on users (cost=0.43..255244.73 rows=1995958 width=72) (actual time=0.096..218.272 rows=100000 loops=1)
        Buffers: shared hit=61313 read=38958
Planning time: 0.131 ms
Execution time: 238.861 ms
(6 rows)
explain (analyze,buffers) select * from users order by userid limit 100000;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------
Limit (cost=246955.09..247205.09 rows=100000 width=72) (actual time=722.003..749.696 rows=100000 loops=1)
  Buffers: shared hit=16192 read=35063
  -> Sort (cost=246955.09..251944.99 rows=1995958 width=72) (actual time=722.001..737.715 rows=100000 loops=1)
        Sort Key: userid
        Sort Method: top-N heapsort Memory: 20207kB
        Buffers: shared hit=16192 read=35063
        -> Seq Scan on users (cost=0.00..71214.58 rows=1995958 width=72) (actual time=8.584..294.605 rows=1995958 loops=1)
              Buffers: shared hit=16192 read=35063
Planning time: 0.070 ms
Execution time: 757.495 ms
(10 rows)
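A quick back-of-the-envelope check on these two 128 MB runs, as a sketch under stated assumptions. It assumes the default 8 kB block size, and it treats hit + read on the Seq Scan node as the number of pages in the table, since a sequential scan visits each page once. Keep in mind that `read` in the Buffers line only means the block was not found in shared buffers; it does not say whether it came from disk or from the operating system's file cache.

```python
BLOCK_SIZE = 8 * 1024  # assumption: default 8 kB PostgreSQL block size
MIB = 1024 * 1024

# How many pages fit in shared buffers at the 128 MB setting
shared_buffers_pages = (128 * MIB) // BLOCK_SIZE

# Seq Scan node in the 128 MB run: shared hit=16192 read=35063
table_pages = 16192 + 35063
table_mib = table_pages * BLOCK_SIZE / MIB

# Index Scan node in the 128 MB run: 218.272 ms, shared read=38958
index_us_per_read = 218.272 * 1000 / 38958

print(f"shared buffers hold {shared_buffers_pages} pages")
print(f"table is about {table_pages} pages ({table_mib:.0f} MiB)")
print(f"index scan: {index_us_per_read:.1f} microseconds per block read")
```

Under those assumptions the table is around three times larger than shared buffers, so it cannot be fully cached there, yet each `read` in the index scan took only a few microseconds. That is far too fast for physical disk access, which suggests the blocks were being served from the OS cache rather than from disk.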
I have heard people say not to trust timing results taken on a Mac or a desktop machine, but this is completely crazy.