Performance does not improve even after increasing work_mem

Time: 2019-11-06 17:10:37

Tags: postgresql query-performance

I am running a query that takes about 170 seconds on average to execute. I went through the PostgreSQL documentation, which mentions that increasing work_mem can improve performance. I increased work_mem to 1000 MB, but performance did not improve.
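(The exact command used is not shown in the question; for a single session, the increase would typically look something like this sketch.)

-- Hypothetical: how work_mem might have been raised for this test session.
SET work_mem = '1000MB';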

Note: I have created indexes on all the fields used in the query.

Below I have pasted the number of records in the database, the query, the results, and the query execution plans.

  • Number of records in the database:
event_logs=> select count(*) from events;
  count   
----------
 18706734
(1 row)
  • Query:
select raw->'request_payload'->'source'->0 as file, 
       count(raw->'request_payload'->>'status') as count, 
       raw->'request_payload'->>'status' as status 
from events 
where client = 'NTT' 
  and to_char(datetime, 'YYYY-MM-DD') = '2019-10-31' 
  and event_name = 'wbs_indexing' 
group by raw->'request_payload'->'source'->0, 
         raw->'request_payload'->>'status';
  • Result:
    file     | count  | status  
-------------+--------+---------
 "xyz.csv"   |  91878 | failure
 "abc.csv"   |  91816 | failure
 "efg.csv"   | 398196 | failure
(3 rows)

  • Query execution plan with the default work_mem (4 MB):
event_logs=> SHOW work_mem;
 work_mem 
----------
 4MB
(1 row)

event_logs=> explain analyze select raw->'request_payload'->'source'->0 as file, count(raw->'request_payload'->>'status') as count,  raw->'request_payload'->>'status' as status from events where to_char(datetime, 'YYYY-MM-DD') = '2019-10-31' and client = 'NTT'  and event_name = 'wbs_indexing' group by raw->'request_payload'->'source'->0, raw->'request_payload'->>'status';
                                                                             QUERY PLAN                                                                              
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Finalize GroupAggregate  (cost=3256017.54..3267087.56 rows=78474 width=72) (actual time=172547.598..172965.581 rows=3 loops=1)
   Group Key: ((((raw -> 'request_payload'::text) -> 'source'::text) -> 0)), (((raw -> 'request_payload'::text) ->> 'status'::text))
   ->  Gather Merge  (cost=3256017.54..3264829.34 rows=65674 width=72) (actual time=172295.204..172965.630 rows=9 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         ->  Partial GroupAggregate  (cost=3255017.52..3256248.91 rows=32837 width=72) (actual time=172258.342..172737.534 rows=3 loops=3)
               Group Key: ((((raw -> 'request_payload'::text) -> 'source'::text) -> 0)), (((raw -> 'request_payload'::text) ->> 'status'::text))
               ->  Sort  (cost=3255017.52..3255099.61 rows=32837 width=533) (actual time=171794.584..172639.670 rows=193963 loops=3)
                     Sort Key: ((((raw -> 'request_payload'::text) -> 'source'::text) -> 0)), (((raw -> 'request_payload'::text) ->> 'status'::text))
                     Sort Method: external merge  Disk: 131856kB
                     ->  Parallel Seq Scan on events  (cost=0.00..3244696.75 rows=32837 width=533) (actual time=98846.155..169311.063 rows=193963 loops=3)
                           Filter: ((client = 'NTT'::text) AND (event_name = 'wbs_indexing'::text) AND (to_char(datetime, 'YYYY-MM-DD'::text) = '2019-10-31'::text))
                           Rows Removed by Filter: 6041677
 Planning time: 0.953 ms
 Execution time: 172983.273 ms
(15 rows)

  • Query execution plan with work_mem increased to 1000 MB:
event_logs=> SHOW work_mem;
 work_mem 
----------
 1000MB
(1 row)

event_logs=> explain analyze select raw->'request_payload'->'source'->0 as file, count(raw->'request_payload'->>'status') as count,  raw->'request_payload'->>'status' as status from events where to_char(datetime, 'YYYY-MM-DD') = '2019-10-31' and client = 'NTT'  and event_name = 'wbs_indexing' group by raw->'request_payload'->'source'->0, raw->'request_payload'->>'status';
                                                                            QUERY PLAN                                                                              
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Finalize GroupAggregate  (cost=3248160.04..3259230.06 rows=78474 width=72) (actual time=167979.419..168189.228 rows=3 loops=1)
   Group Key: ((((raw -> 'request_payload'::text) -> 'source'::text) -> 0)), (((raw -> 'request_payload'::text) ->> 'status'::text))
   ->  Gather Merge  (cost=3248160.04..3256971.84 rows=65674 width=72) (actual time=167949.951..168189.282 rows=9 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         ->  Partial GroupAggregate  (cost=3247160.02..3248391.41 rows=32837 width=72) (actual time=167945.607..168083.707 rows=3 loops=3)
               Group Key: ((((raw -> 'request_payload'::text) -> 'source'::text) -> 0)), (((raw -> 'request_payload'::text) ->> 'status'::text))
               ->  Sort  (cost=3247160.02..3247242.11 rows=32837 width=533) (actual time=167917.891..167975.549 rows=193963 loops=3)
                     Sort Key: ((((raw -> 'request_payload'::text) -> 'source'::text) -> 0)), (((raw -> 'request_payload'::text) ->> 'status'::text))
                     Sort Method: quicksort  Memory: 191822kB
                     ->  Parallel Seq Scan on events  (cost=0.00..3244696.75 rows=32837 width=533) (actual time=98849.936..167570.669 rows=193963 loops=3)
                           Filter: ((client = 'NTT'::text) AND (event_name = 'wbs_indexing'::text) AND (to_char(datetime, 'YYYY-MM-DD'::text) = '2019-10-31'::text))
                           Rows Removed by Filter: 6041677
 Planning time: 0.238 ms
 Execution time: 168199.046 ms
(15 rows)

  • Can someone help me improve the performance of this query?

2 Answers:

Answer 0 (score: 2)

Increasing work_mem did make the sort about 8 times faster: (172639.670 - 169311.063) / (167975.549 - 167570.669), i.e. roughly 3329 ms / 405 ms ≈ 8. But since the sort accounts for only a small fraction of the overall execution time, making it even 1000 times faster would barely improve the overall result. It is the seq scan that is taking the time.

Most of the time in the seq scan is probably being spent on IO. You can check by turning on track_io_timing and then running EXPLAIN (ANALYZE, BUFFERS).
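A sketch of how to gather that information, assuming you have the privileges needed to change track_io_timing for the session:

-- track_io_timing is off by default and normally needs superuser (or granted SET) privilege.
SET track_io_timing = on;

EXPLAIN (ANALYZE, BUFFERS)
SELECT raw->'request_payload'->'source'->0      AS file,
       count(raw->'request_payload'->>'status') AS count,
       raw->'request_payload'->>'status'        AS status
FROM events
WHERE client = 'NTT'
  AND to_char(datetime, 'YYYY-MM-DD') = '2019-10-31'
  AND event_name = 'wbs_indexing'
GROUP BY raw->'request_payload'->'source'->0,
         raw->'request_payload'->>'status';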

Also, parallelized seq scans are usually not very helpful, because thanks to the magic of read-ahead the IO system can typically deliver its full capacity to a single reader. Sometimes parallel readers even step on each other's toes, making things worse overall. You can disable parallelization with set max_parallel_workers_per_gather TO 0; (see the snippet below), which might make things faster; if not, it will at least make the EXPLAIN plans easier to understand.
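At the session level, that test would look like this (it does not change the server-wide default):

-- Disable parallel query for this session only, then re-run EXPLAIN (ANALYZE, BUFFERS) and compare.
SET max_parallel_workers_per_gather TO 0;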

You are fetching a bit over 3% of the table: 193963 / (193963 + 6041677). When fetching that large a share, indexes may not be very helpful. If they are to help, you need a combined index, not individual ones. So you need an index on (client, event_name, date(datetime)). Then you also need to change the query to use date(datetime) rather than to_char(datetime, 'YYYY-MM-DD'). You need to make that change because to_char is not immutable and therefore cannot be indexed.
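A minimal sketch of that suggestion; it assumes datetime is a timestamp without time zone (for timestamptz, date(datetime) is not immutable either), and the index name is made up:

-- Combined expression index covering the three filter conditions (hypothetical name).
CREATE INDEX events_client_event_date_idx
    ON events (client, event_name, date(datetime));

-- Query rewritten to filter on the same indexed expression:
SELECT raw->'request_payload'->'source'->0      AS file,
       count(raw->'request_payload'->>'status') AS count,
       raw->'request_payload'->>'status'        AS status
FROM events
WHERE client = 'NTT'
  AND event_name = 'wbs_indexing'
  AND date(datetime) = DATE '2019-10-31'
GROUP BY raw->'request_payload'->'source'->0,
         raw->'request_payload'->>'status';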

Answer 1 (score: 0)

Solved the issue by modifying the query. The problem was the to_char call: it converts the date value to a string for every record in the table in order to match it against the given string date. So I updated the query to fetch records between the given date and the next day instead. Now I get the response within 500 ms.
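A sketch of that kind of rewrite (the answer does not show the exact query; a plain half-open range predicate on datetime avoids the per-row to_char call and can use an index on the column):

-- Half-open range covering all of 2019-10-31.
SELECT raw->'request_payload'->'source'->0      AS file,
       count(raw->'request_payload'->>'status') AS count,
       raw->'request_payload'->>'status'        AS status
FROM events
WHERE client = 'NTT'
  AND event_name = 'wbs_indexing'
  AND datetime >= DATE '2019-10-31'
  AND datetime <  DATE '2019-11-01'
GROUP BY raw->'request_payload'->'source'->0,
         raw->'request_payload'->>'status';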