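Improving query performance

Time: 2020-06-23 15:46:01

Tags: postgresql query-performance

We have a PostgreSQL query with multiple tables and left outer joins, and it runs very slowly. It takes 25 to 40 seconds to finish, so we would like to optimize it further and bring the run time down to 1-2 seconds.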

 select a.campaignid,
        b.campaign_name,
        case when b.message_type_id = 1 then 'Promotional'
             when b.message_type_id = 2 then 'Transactional'
             else 'Other'
        end as Campaign_type,
        c.username,
        aggregator_type,
        e.cli_manager_id as senderID,
        b.schedule_time as campaign_schedule_date,
        count(a.mobile) as campaign_submitted_count,
        count(case when a.status = 'DELIVRD' then mobile end) as Delivered,
        count(a.mobile) as Total_count,
        count(case when a.status = 'FAILED' then mobile end) as failure_count,
        count(case when a.status = 'DND_check_failed' then mobile end) as DND_count,
        sum(credits_used) as credits_used
 from tbl_cdr_test a
 left outer join tbl_campaign b on a.campaignid = b.tbl_campaign_id
 left outer join tbl_users_master c on b.user_id = c.user_master_id
 left outer join tbl_cli_manager e on b.user_id = e.user_id
 left outer join tbl_user_channel f on b.user_id = f.user_id
 left outer join tbl_user_configurations g on b.user_id = g.user_id
 where date(insert_datetime) between '2020-05-23' and '2020-06-23'
   and c.username = coalesce(null, c.username)
   and g.msg_cat_id = coalesce(null, g.msg_cat_id)
   and a.campaignid = coalesce(null, a.campaignid)
   and e.cli_manager_id = coalesce(null, e.cli_manager_id)
 group by a.campaignid, b.campaign_name, b.message_type_id, c.username,
          b.schedule_time, aggregator_type, e.cli_manager_id;

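We have also created appropriate indexes, but it still takes a long time.

In addition, the execution plan shows an "external merge Disk" sort method. To fix that I set work_mem = 50MB, but the sort still uses disk instead of memory. Please advise.

Here is the execution plan: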

 GroupAggregate  (cost=4872.01..4872.07 rows=1 width=543) (actual time=20564.239..27415.264 rows=8 loops=1)
   Group Key: a.campaignid, b.campaign_name, b.message_type_id, c.username, b.schedule_time, f.aggregator_type, e.cli_manager_id
   ->  Sort  (cost=4872.01..4872.01 rows=1 width=483) (actual time=19627.424..25020.702 rows=3206196 loops=1)
         Sort Key: a.campaignid, b.campaign_name, b.message_type_id, c.username, b.schedule_time, f.aggregator_type, e.cli_manager_id
         Sort Method: external merge  Disk: 281456kB
         ->  Nested Loop  (cost=22.03..4872.00 rows=1 width=483) (actual time=99.704..12086.244 rows=3206196 loops=1)
               Join Filter: (b.user_id = g.user_id)
               ->  Nested Loop Left Join  (cost=21.89..4871.79 rows=1 width=495) (actual time=99.688..4518.533 rows=3206196 loops=1)
                     ->  Nested Loop  (cost=21.75..4871.54 rows=1 width=77) (actual time=99.664..935.689 rows=356244 loops=1)
                           ->  Nested Loop  (cost=21.33..31.57 rows=1 width=65) (actual time=0.295..2.376 rows=588 loops=1)
                                 Join Filter: (b.user_id = c.user_master_id)
                                 ->  Merge Join  (cost=21.18..30.22 rows=6 width=46) (actual time=0.246..0.663 rows=588 loops=1)
                                       Merge Cond: (e.user_id = b.user_id)
                                       ->  Index Scan using "idx_FK_7hc6agd_tbl_cli_ma_1592228110_32" on tbl_cli_manager e  (cost=0.42..6281.84 rows=762 width=12) (actual time=0.014..0.035 rows=5 loops=1)
                                             Filter: (cli_manager_id = COALESCE(cli_manager_id))
                                       ->  Sort  (cost=20.76..21.13 rows=147 width=34) (actual time=0.225..0.333 rows=585 loops=1)
                                             Sort Key: b.user_id
                                             Sort Method: quicksort  Memory: 36kB
                                             ->  Seq Scan on tbl_campaign b  (cost=0.00..15.47 rows=147 width=34) (actual time=0.013..0.154 rows=147 loops=1)
                                 ->  Index Scan using ind_user_master_c_user on tbl_users_master c  (cost=0.14..0.21 rows=1 width=19) (actual time=0.002..0.002 rows=1 loops=588)
                                       Index Cond: (user_master_id = e.user_id)
                                       Filter: ((username)::text = (COALESCE(username))::text)
                           ->  Append  (cost=0.42..4839.94 rows=3 width=20) (actual time=0.546..1.426 rows=606 loops=588)
                                 ->  Index Scan using testh11_campaignid_idx on testh11 a  (cost=0.42..4253.99 rows=2 width=20) (actual time=0.543..0.543 rows=0 loops=588)
                                       Index Cond: (campaignid = b.tbl_campaign_id)
                                       Filter: ((campaignid = COALESCE(campaignid)) AND (date(insert_datetime) >= '2020-05-23'::date) AND (date(insert_datetime) <= '2020-06-23'::date))
                                       Rows Removed by Filter: 656
                                 ->  Index Scan using testh21_campaignid_idx on testh21 a_1  (cost=0.42..585.94 rows=1 width=20) (actual time=0.002..0.796 rows=606 loops=588)
                                       Index Cond: (campaignid = b.tbl_campaign_id)
                                       Filter: ((campaignid = COALESCE(campaignid)) AND (date(insert_datetime) >= '2020-05-23'::date) AND (date(insert_datetime) <= '2020-06-23'::date))
                     ->  Index Scan using idx_user_id_tbl_user_c_1592227657_19 on tbl_user_channel f  (cost=0.14..0.24 rows=1 width=422) (actual time=0.002..0.004 rows=9 loops=356244)
                           Index Cond: (user_id = b.user_id)
               ->  Index Scan using "idx_FK_6958qvy_tbl_user_c_1592228774_151" on tbl_user_configurations g  (cost=0.14..0.20 rows=1 width=8) (actual time=0.002..0.002 rows=1 loops=3206196)
                     Index Cond: (user_id = e.user_id)
                     Filter: (msg_cat_id = COALESCE(msg_cat_id))
 Planning Time: 6.561 ms
 Execution Time: 27477.860 ms

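1 answer:

Answer 0: (score: 0)

The index scan on testh21 grossly underestimates the number of result rows (the plan estimates rows=1 where 606 rows are actually returned per loop). As a result, PostgreSQL chooses nested loop joins, and that is where the time is spent.

Try the following:

  • Fresh statistics: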

     ANALYZE testh21;
    

    If that improves the estimate, make sure that autoanalyze processes the table more often.

  • Fix bad estimates caused by correlation:

     CREATE STATISTICS testh21_stat (dependencies)
        ON campaignid, insert_datetime FROM testh21;
     ANALYZE testh21;
    

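    Perhaps there is a correlation between the columns, in which case extended statistics improve the estimate.

  • More detailed statistics: try raising default_statistics_target before you ANALYZE the table.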
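
The first and third suggestions above could look roughly like this. It is only a sketch: the scale factor 0.01 and the statistics target 1000 are assumed values that do not come from the answer and would need tuning for the real table.

```sql
-- Let autoanalyze process testh21 once about 1% of its rows
-- (plus a small fixed threshold) have changed.
-- Assumed value; the default autovacuum_analyze_scale_factor is 0.1.
ALTER TABLE testh21 SET (autovacuum_analyze_scale_factor = 0.01);

-- Collect more detailed statistics: raise the target for this session
-- (assumed value; the default is 100), then re-analyze the table.
SET default_statistics_target = 1000;
ANALYZE testh21;
```

Rerunning the query with EXPLAIN (ANALYZE) afterwards shows whether the estimated row count for the index scan on testh21 has moved closer to the actual one.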

If you cannot improve the estimates, go for the hammer and set enable_nestloop = off for the duration of the query.
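
One way to keep that setting limited to a single query is to use SET LOCAL inside a transaction, so that it reverts automatically at commit or rollback. A minimal sketch, with the slow query left as a placeholder:

```sql
BEGIN;
-- Takes effect only until the end of this transaction.
SET LOCAL enable_nestloop = off;
-- ... run the slow query here ...
COMMIT;
```

Note that enable_nestloop = off does not forbid nested loops outright. It only makes the planner treat them as very expensive, so they can still show up when no other join method is possible.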