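I'm running Ubuntu 16.04 with PostgreSQL 9.5 and Django 1.11.
My website suffers from extremely long AJAX calls (over 30 seconds in some cases). The same AJAX calls take about 500 ms in development.
The problem is related to disk read I/O. Executing a single query in production drives disk read I/O up to 25 MB/s; the same query in development results in less than 0.01 MB/s of disk read I/O. The code and the query are identical in production and development.
So Postgres in production is causing abnormally high disk read I/O. What could be the reason?
Here is an example query that takes about 25 seconds in production and only 500 ms in development: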
EXPLAIN (ANALYZE, BUFFERS)
SELECT COUNT(*) AS "__count" FROM "map_listing"
WHERE ("map_listing"."lo" < -79.32516245458987 AND "map_listing"."la" > 43.640279060122346
AND "map_listing"."lo" > -79.60531382177737 AND "map_listing"."transaction_type" = 'Sale'
AND "map_listing"."la" < 43.774544561921296
AND NOT ("map_listing"."status" = 'Sld' AND "map_listing"."sold_date" < '2018-01-21'::date
AND "map_listing"."sold_date" IS NOT NULL)
AND NOT (("map_listing"."status" = 'Ter' OR "map_listing"."status" = 'Exp'))
AND NOT (("map_listing"."property_type" = 'Parking Space' OR "map_listing"."property_type" = 'Locker')));
Result of EXPLAIN (ANALYZE, BUFFERS) on the above statement (production):
Aggregate (cost=89924.55..89924.56 rows=1 width=0) (actual time=27318.859..27318.860 rows=1 loops=1)
Buffers: shared read=73424
-> Bitmap Heap Scan on map_listing (cost=4873.96..89836.85 rows=35079 width=0) (actual time=6061.214..27315.183 rows=3228 loops=1)
Recheck Cond: ((la > 43.640279060122346) AND (la < 43.774544561921296))
Rows Removed by Index Recheck: 86733
Filter: ((lo < '-79.32516245458987'::numeric) AND (lo > '-79.60531382177737'::numeric) AND ((status)::text <> 'Ter'::text) AND ((status)::text <> 'Exp'::text) AND ((property_type)::text <> 'Parking Space'::text) AND ((property_type)::text <> 'Locker'::text) AND ((transaction_type)::text = 'Sale'::text) AND (((status)::text <> 'Sld'::text) OR (sold_date >= '2018-01-21'::date) OR (sold_date IS NULL)))
Rows Removed by Filter: 190108
Heap Blocks: exact=46091 lossy=26592
Buffers: shared read=73424
-> Bitmap Index Scan on map_listing_la_88ca396c (cost=0.00..4865.19 rows=192477 width=0) (actual time=156.964..156.964 rows=194434 loops=1)
Index Cond: ((la > 43.640279060122346) AND (la < 43.774544561921296))
Buffers: shared read=741
Planning time: 0.546 ms
Execution time: 27318.926 ms
(14 rows)
Result of EXPLAIN (ANALYZE, BUFFERS) (development):
Aggregate (cost=95326.23..95326.24 rows=1 width=8) (actual time=495.373..495.373 rows=1 loops=1)
Buffers: shared read=77281
-> Bitmap Heap Scan on map_listing (cost=5211.98..95225.57 rows=40265 width=0) (actual time=80.929..495.140 rows=4565 loops=1)
Recheck Cond: ((la > 43.640279060122346) AND (la < 43.774544561921296))
Rows Removed by Index Recheck: 85958
Filter: ((lo < '-79.32516245458987'::numeric) AND (lo > '-79.60531382177737'::numeric) AND ((status)::text <> 'Ter'::text) AND ((status)::text <> 'Exp'::text) AND ((property_type)::text <> 'Parking Space'::text) AND ((property_type)::text <> 'Locker'::text) AND ((transaction_type)::text = 'Sale'::text) AND (((status)::text <> 'Sld'::text) OR (sold_date >= '2018-01-21'::date) OR (sold_date IS NULL)))
Rows Removed by Filter: 198033
Heap Blocks: exact=49858 lossy=26639
Buffers: shared read=77281
-> Bitmap Index Scan on map_listing_la_88ca396c (cost=0.00..5201.91 rows=205749 width=0) (actual time=73.070..73.070 rows=205569 loops=1)
Index Cond: ((la > 43.640279060122346) AND (la < 43.774544561921296))
Buffers: shared read=784
Planning time: 0.962 ms
Execution time: 495.822 ms
(14 rows)
Answer 0 (score: 2)
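The plan shows shared read=73424 and no shared hit, so none of the blocks were found in PostgreSQL's shared buffers; every one had to be requested from the operating system, which serves it either from its page cache or from disk. Because the query reads 73424 blocks (about 574 MB), it produces substantial I/O load whenever the table is not cached.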
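To put the block counts in perspective, here is a small sketch that converts them to a data volume and an implied read rate. It assumes the default 8 kB PostgreSQL block size and, as a simplification, that the whole execution time was spent fetching blocks; the function name is made up for illustration.

```python
BLOCK_SIZE = 8192  # bytes; PostgreSQL's default block size (assumed unchanged)


def read_stats(blocks, seconds):
    """Return (MiB read, MiB per second) for a block count and a runtime."""
    mib = blocks * BLOCK_SIZE / 1024 ** 2
    return mib, mib / seconds


# Block counts and execution times copied from the two plans in the question.
prod_mib, prod_rate = read_stats(73424, 27.318926)
dev_mib, dev_rate = read_stats(77281, 0.495822)

print(f"production:  {prod_mib:.1f} MiB at {prod_rate:.1f} MiB/s")
print(f"development: {dev_mib:.1f} MiB at {dev_rate:.1f} MiB/s")
```

Under these assumptions production comes out at roughly 21 MiB/s, in line with the 25 MB/s of disk reads reported in the question, while development exceeds 1 GiB/s. A rate that high suggests the development blocks were already cached in memory, which would also explain the near-zero disk reads observed there.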
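Still, there are two things that can be improved.
There are lossy block matches in the heap scan. That means work_mem is not large enough to hold a bitmap with one bit per table row, so 26592 bits map a whole table block instead. Every row in those blocks has to be rechecked, and 86733 rows are discarded, most of them false positives from the lossy block matches.
If you increase work_mem, a bitmap with one bit per table row will fit into memory and this number will shrink, reducing the work done during the heap scan.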
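To check whether a work_mem change actually helped, you can compare the Heap Blocks line before and after. A minimal sketch, assuming you paste that line from the EXPLAIN (ANALYZE) output as a string; the helper name is made up for illustration.

```python
import re


def heap_block_stats(line):
    """Parse 'Heap Blocks: exact=N lossy=M'; a missing entry counts as 0."""
    counts = {key: int(val) for key, val in re.findall(r"(exact|lossy)=(\d+)", line)}
    exact = counts.get("exact", 0)
    lossy = counts.get("lossy", 0)
    total = exact + lossy
    share = lossy / total if total else 0.0
    return exact, lossy, share


# Line copied from the production plan in the question.
exact, lossy, share = heap_block_stats("Heap Blocks: exact=46091 lossy=26592")
print(f"{lossy} of {exact + lossy} heap blocks are lossy ({share:.1%})")
```

In the production plan a bit more than a third of the visited heap blocks are lossy. After raising work_mem and re-running the query, the lossy count should drop, ideally to zero.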
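190108 rows are discarded because they do not match the additional filter condition in the bitmap heap scan. This is probably where most of the time is spent. If you can reduce that number, you win.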
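A quick way to see how little of the fetched data is actually used is to compare the rows returned with the rows removed by the filter. A rough sketch using the numbers from the two plans; the function name is made up for illustration.

```python
def filter_survival(rows_returned, removed_by_filter):
    """Fraction of the rows reaching the Filter step that pass it."""
    reached_filter = rows_returned + removed_by_filter
    return rows_returned / reached_filter


# 'rows' of the Bitmap Heap Scan and 'Rows Removed by Filter' from each plan.
prod = filter_survival(3228, 190108)
dev = filter_survival(4565, 198033)

print(f"production:  {prod:.1%} of rows in the latitude range survive the filter")
print(f"development: {dev:.1%} of rows in the latitude range survive the filter")
```

In both environments only about 2% of the rows that match the latitude range end up in the result, so an index that also covers the other conditions would let the scan skip most of those heap fetches.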
The ideal indexes for this query would be:
CREATE INDEX ON map_listing(transaction_type, la);
CREATE INDEX ON map_listing(transaction_type, lo);
If transaction_type is not very selective (i.e. most rows have the value Sale), you can omit that column.
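Edit:
Examination of vmstat and iostat shows that both the CPU and the I/O subsystem are massively overloaded: all CPU resources are spent on I/O wait and VM steal time. You need a better I/O system and a host system with more free CPU resources. Increasing RAM might alleviate the I/O problem, but only for disk reads.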
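If you want to track I/O wait and steal time yourself (vmstat reports them as the wa and st columns), the same counters are exposed on Linux in the first line of /proc/stat. Below is a minimal sketch that computes the shares between two samples of that line. The sample values are made up for illustration and are not measurements from this system.

```python
# Order of the first eight counters on the aggregate "cpu" line of /proc/stat.
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def cpu_shares(before, after):
    """Return the share of CPU time per state between two 'cpu' lines."""
    first = [int(x) for x in before.split()[1:9]]
    second = [int(x) for x in after.split()[1:9]]
    deltas = [b - a for a, b in zip(first, second)]
    total = sum(deltas)
    if total <= 0:
        raise ValueError("samples must be taken at different times")
    return {name: delta / total for name, delta in zip(FIELDS, deltas)}


# Hypothetical samples, for illustration only.
before = "cpu 1000 0 500 8000 400 0 10 90 0 0"
after = "cpu 1100 0 550 8100 900 0 20 330 0 0"

shares = cpu_shares(before, after)
print(f"iowait: {shares['iowait']:.0%}, steal: {shares['steal']:.0%}")
```

On a real host you would read the first line of /proc/stat twice, a few seconds apart, and pass both strings to the function. High iowait points at the storage, while high steal means the hypervisor is giving CPU time to other guests.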
Answer 1 (score: 1)
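(I don't have the privilege to comment yet.)
I'm currently running into a problem similar to Jack's. After creating the indexes my queries got slower, and tuning work_mem and shared_buffers has not brought any improvement.
When you say RAM was the problem, what did you do to fix it? My server has 32 GB of RAM, and I even tried setting work_mem = 16GB.
iotop reads: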
DISK READ DISK WRITE SWAPIN IO> COMMAND
86.28 M/s 0.00 B/s 0.00 % 87.78 % postgres