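Postgres DB: SQL statement takes far longer in production than locally

Time: 2019-04-14 18:38:08

Tags: sql postgresql

I am running Postgres 11.2 on Heroku in production and Postgres 11.2 locally.

I have one particular query that is timing out in production.

Here is an example of the query: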

SELECT u0.id, u0.app, u0.email, u0.gender, u0.fcm_token,
u0.flagged_user_ids, u0.flags, u0.gems_balance, u0.hearts_balance,
u0.hearts_received, u0.last_hearts_refreshed, u0.looking, u0.matches,
u0.meta_tags, u0.name, u0.online, u0.profile_image, u0.purchase_count,
u0.recents_history, u0.inserted_at, u0.updated_at
FROM users AS u0 WHERE ((u0.email != 'ready234@yahoo.com') AND (u0.looking = TRUE))
AND (NOT (u0.email IN ('charlie_pevijom_angel@tfbnw.net','indzdugsyh_1487230198@tfbnw.net','iznngyqvfh_1484692514@tfbnw.net','android@test.com','ios@test.com','ios@banana.com','android@banana.com','justinb@banana.com','herb.perv@banana.com')))
AND (NOT ('apple_review' = ANY(u0.meta_tags)))
AND (NOT ('ready234@yahoo.com' = ANY(u0.matches)))
AND (NOT (584065 = ANY(u0.flagged_user_ids)))
ORDER BY u0.updated_at LIMIT 1;

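When I download a pg_dump of the production database, restore it locally, and run the query from a local SQL console, it is very fast.

Production: Time: 46307.569 ms (00:46.308), i.e. 46 seconds!

Local: Time: 176.106 ms, well under 1 second!

I have verified that the local copy of the table is defined the same way as the one in production.

Production: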

Indexes:
    "users_pkey" PRIMARY KEY, btree (id)
    "users_app_index" btree (app)
    "users_email_index" btree (email)
    "users_matches_index" btree (matches)
Referenced by:
    TABLE "chat_users" CONSTRAINT "chat_users_user_id_fkey" FOREIGN KEY (user_id) REFERENCES users(id)
    TABLE "messages" CONSTRAINT "messages_from_id_fkey" FOREIGN KEY (from_id) REFERENCES users(id)

Local:

Indexes:
    "users_pkey" PRIMARY KEY, btree (id)
    "users_app_index" btree (app)
    "users_email_index" btree (email)
    "users_matches_index" btree (matches)
Referenced by:
    TABLE "chat_users" CONSTRAINT "chat_users_user_id_fkey" FOREIGN KEY (user_id) REFERENCES users(id)
    TABLE "messages" CONSTRAINT "messages_from_id_fkey" FOREIGN KEY (from_id) REFERENCES users(id)

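I don't understand why the query takes so long in production. Is there a way to debug this?

Update:

As requested, here are the execution plans for both:

Production: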

QUERY PLAN                                                                                                                                                                                         
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=443802.88..443802.88 rows=1 width=834) (actual time=40956.176..40956.176 rows=0 loops=1)
   Buffers: shared hit=5074 read=431010 dirtied=20
   I/O Timings: read=39940.500
   ->  Sort  (cost=443802.88..443802.88 rows=1 width=834) (actual time=40956.174..40956.175 rows=0 loops=1)
         Sort Key: updated_at
         Sort Method: quicksort  Memory: 25kB
         Buffers: shared hit=5074 read=431010 dirtied=20
         I/O Timings: read=39940.500
         ->  Seq Scan on users u0  (cost=0.00..443802.88 rows=1 width=834) (actual time=40956.166..40956.166 rows=0 loops=1)
               Filter: (looking AND ((email)::text <> 'ready234@yahoo.com'::text) AND ((email)::text <> ALL ('{charlie_pevijom_angel@tfbnw.net,indzdugsyh_1487230198@tfbnw.net,iznngyqvfh_1484692514@tfbnw.net,android@test.com,ios@test.com,ios@banana.com,android@banana.com,justinb@banana.com,herb.perv@banana.com}'::text[])) AND ('apple_review'::text <> ALL ((meta_tags)::text[])) AND ('ready234@yahoo.com'::text <> ALL ((matches)::text[])) AND (584065 <> ALL (flagged_user_ids)))
               Rows Removed by Filter: 583206
               Buffers: shared hit=5074 read=431010 dirtied=20
               I/O Timings: read=39940.500
 Planning Time: 0.259 ms
 Execution Time: 40956.221 ms
(15 rows)

Local:

QUERY PLAN                                                                                                                                                                                   
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=78520.41..78520.41 rows=1 width=827) (actual time=164.434..164.434 rows=0 loops=1)
   Buffers: shared hit=11616 read=51044
   ->  Sort  (cost=78520.41..78520.41 rows=1 width=827) (actual time=164.433..164.433 rows=0 loops=1)
         Sort Key: updated_at
         Sort Method: quicksort  Memory: 25kB
         Buffers: shared hit=11616 read=51044
         ->  Gather  (cost=1000.00..78520.40 rows=1 width=827) (actual time=164.414..167.228 rows=0 loops=1)
               Workers Planned: 2
               Workers Launched: 2
               Buffers: shared hit=11616 read=51044
               ->  Parallel Seq Scan on users u0  (cost=0.00..77520.30 rows=1 width=827) (actual time=159.986..159.986 rows=0 loops=3)
                     Filter: (looking AND ((email)::text <> 'ready234@yahoo.com'::text) AND ((email)::text <> ALL ('{charlie_pevijom_angel@tfbnw.net,indzdugsyh_1487230198@tfbnw.net,iznngyqvfh_1484692514@tfbnw.net,android@test.com,ios@test.com,ios@banana.com,android@banana.com,justinb@banana.com,herb.perv@banana.com}'::text[])) AND ('apple_review'::text <> ALL ((meta_tags)::text[])) AND ('ready234@yahoo.com'::text <> ALL ((matches)::text[])) AND (584065 <> ALL (flagged_user_ids)))
                     Rows Removed by Filter: 194158
                     Buffers: shared hit=11616 read=51044
 Planning Time: 0.376 ms
 Execution Time: 167.300 ms
(16 rows)


Update 2:

Production:

 SELECT reltuples, relpages
 FROM pg_class
 WHERE relname = 'users';
 reltuples | relpages
-----------+----------
    582557 |   436084
(1 row)

Time: 27.967 ms

Local:

 SELECT reltuples, relpages
 FROM pg_class
 WHERE relname = 'users';
 reltuples | relpages
-----------+----------
    582281 |    62660
(1 row)

Time: 1.816 ms

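Update 3:

After looking at the execution plans and poking around, I thought the difference might be due to the Parallel Seq Scan. However, I tried SET max_parallel_workers_per_gather TO 4; and the query is still very slow even with a Parallel Seq Scan. It looks like the I/O is what takes so long:

Production: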

Limit  (cost=439013.82..439013.82 rows=1 width=834) (actual time=36739.071..36739.071 rows=0 loops=1)
   Buffers: shared hit=5176 read=430908
   I/O Timings: read=182157.859
   ->  Sort  (cost=439013.82..439013.82 rows=1 width=834) (actual time=36739.069..36739.069 rows=0 loops=1)
         Sort Key: updated_at
         Sort Method: quicksort  Memory: 25kB
         Buffers: shared hit=5176 read=430908
         I/O Timings: read=182157.859
         ->  Gather  (cost=1000.00..439013.82 rows=1 width=834) (actual time=36739.061..36744.559 rows=0 loops=1)
               Workers Planned: 4
               Workers Launched: 4
               Buffers: shared hit=5176 read=430908
               I/O Timings: read=182157.859
               ->  Parallel Seq Scan on users u0  (cost=0.00..438013.72 rows=1 width=834) (actual time=36727.506..36727.506 rows=0 loops=5)
                     Filter: (looking AND ((email)::text <> 'ready234@yahoo.com'::text) AND ((email)::text <> ALL ('{charlie_pevijom_angel@tfbnw.net,indzdugsyh_1487230198@tfbnw.net,iznngyqvfh_1484692514@tfbnw.net,android@test.com,ios@test.com,ios@banana.com,android@banana.com,justinb@banana.com,herb.perv@banana.com}'::text[])) AND ('apple_review'::text <> ALL ((meta_tags)::text[])) AND ('ready234@yahoo.com'::text <> ALL ((matches)::text[])) AND (584065 <> ALL (flagged_user_ids)))
                     Rows Removed by Filter: 116645
                     Buffers: shared hit=5176 read=430908
                     I/O Timings: read=182157.859
 Planning Time: 0.265 ms
 Execution Time: 36744.622 ms
(20 rows)

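1 answer:

Answer 0: (score: 2)

Some observations (not your main problem):

  • The slow server has 2.5 times as many rows in the table, but that does not explain a 250-fold slowdown.

  • On the slow machine no parallel workers are planned, even though the table is bigger. Could it be that you are running a PostgreSQL version older than 9.6 on the slow server? If not, parallelism may be disabled (max_parallel_workers_per_gather or max_parallel_workers set to 0, or some other related configuration parameter changed). But that does not account for a slowdown by a factor of 250 either.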
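
The version and parallelism settings mentioned above can be compared by running something like the following on both servers. This is only a sketch: the list of settings is my selection of the ones that commonly influence whether a parallel plan is considered, and which of them (if any) differs on the production server is an assumption to verify.

```sql
-- Confirm that both servers really run the same major version.
SELECT version();

-- Settings that commonly decide whether a parallel plan is considered.
-- The "source" column shows whether a value comes from the default,
-- the configuration file, or a per-session override.
SELECT name, setting, unit, source
FROM pg_settings
WHERE name IN ('max_parallel_workers_per_gather',
               'max_parallel_workers',
               'max_worker_processes',
               'min_parallel_table_scan_size',
               'parallel_setup_cost',
               'parallel_tuple_cost');
```

If max_parallel_workers_per_gather is 0 on one server only, that would explain the missing Gather node in its plan.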

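Your actual problem is the following:

  • On the slow machine, the sequential scan has to read most of the blocks from secondary storage. EXPLAIN (ANALYZE, BUFFERS) shows that almost all blocks are read from disk rather than found in cache, and since you have track_io_timing enabled, we know that 40 of the 41 seconds the query takes are spent doing I/O.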
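
As a rough illustration of how those numbers can be obtained and read, the sketch below checks track_io_timing, runs EXPLAIN (ANALYZE, BUFFERS) on a shortened stand-in for the query (substitute the full query from the question), and then estimates the read throughput implied by the production plan. The 8 kB block size is PostgreSQL's default and is assumed here, not confirmed for this server.

```sql
-- Without track_io_timing the plan has no "I/O Timings" lines.
-- Changing it may require elevated privileges; here it is only inspected.
SHOW track_io_timing;

-- Shortened stand-in for the query in the question (illustrative only).
EXPLAIN (ANALYZE, BUFFERS)
SELECT u0.id
FROM users AS u0
WHERE u0.looking = TRUE
  AND NOT ('apple_review' = ANY(u0.meta_tags))
ORDER BY u0.updated_at
LIMIT 1;

-- Read throughput implied by the production plan:
-- 431010 blocks read (assumed 8 kB each) in 39940.5 ms of read time.
-- Numeric literals are used so the arithmetic is not done in 32-bit integers.
SELECT round(431010 * 8.0 / 1024.0 / (39940.5 / 1000.0), 1) AS mib_per_second;
```

The last statement works out to about 84 MiB/s, which is the kind of figure to compare against what the storage is expected to deliver.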

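This is aggravated by two things:

  • The table on the slow system is badly bloated (about 6/7 of it appears to be empty, since it occupies roughly seven times as many pages for the same number of tuples). The query therefore has to read seven times as many blocks as necessary.

    You might consider running VACUUM (FULL) users the next time you can afford some downtime, and lowering the setting of autovacuum_vacuum_cost_delay.

  • The slow I/O subsystem on the slow server is probably also to blame.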
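
A minimal sketch of checking and then removing the bloat might look like this. The per-table value of 2 ms for autovacuum_vacuum_cost_delay is an illustrative assumption, not a recommendation taken from the answer, and whether a managed service lets you change it should be checked first.

```sql
-- Pages versus tuples as a crude bloat indicator; compare the two servers.
-- pg_table_size also counts TOAST data, so it can exceed relpages * block size.
SELECT relname, reltuples, relpages,
       pg_size_pretty(pg_table_size(oid)) AS table_size
FROM pg_class
WHERE relname = 'users';

-- Dead tuples and the last time a manual or automatic vacuum ran.
SELECT n_live_tup, n_dead_tup, last_vacuum, last_autovacuum
FROM pg_stat_user_tables
WHERE relname = 'users';

-- Rewrites the table compactly. This holds an ACCESS EXCLUSIVE lock for
-- the whole run, so reads and writes on users are blocked meanwhile.
VACUUM (FULL, VERBOSE) users;

-- Let autovacuum work less throttled on this table in the future.
-- The value 2 (milliseconds) is an assumed example.
ALTER TABLE users SET (autovacuum_vacuum_cost_delay = 2);
```

After the rewrite, relpages on production should drop to roughly the local value if bloat was the cause.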

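Your WHERE conditions are all written in a way that cannot make use of an index. This really is a query that needs a sequential scan.

You could throw memory at the problem (at least 4GB) and load the table into cache with pg_prewarm. Or you could get really fast SSD storage. I don't think there is much more you can do.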
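
For the pg_prewarm route, a minimal sketch could be the following. It assumes the pg_prewarm extension is available on the server (it ships with PostgreSQL's contrib modules, but a hosting provider may or may not allow it) and that shared_buffers is large enough to hold the table.

```sql
-- Is the buffer cache large enough to hold the table's heap at all?
SHOW shared_buffers;
SELECT pg_size_pretty(pg_relation_size('users')) AS heap_size;

-- Install the extension if it is not present yet.
CREATE EXTENSION IF NOT EXISTS pg_prewarm;

-- Load the table's main fork into shared buffers.
-- The function returns the number of blocks it prewarmed.
SELECT pg_prewarm('users');
```

The cached pages are lost on restart and can be evicted by other activity, so the prewarm would need to be repeated, and it pays off far more once the table is no longer bloated.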