Postgres planning takes disproportionately long compared to execution time

Time: 2017-06-10 09:35:04

Tags: postgresql

Postgres 9.6 running on Amazon RDS.

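I have two tables:

  1. Aggregated events - a large table with 6 keys (IDs).
  2. Campaign metadata - a small table containing the campaign definitions.
  3. I join the two in order to filter on metadata such as the campaign name.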

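    The query produces a report of displays broken down by campaign channel and date (dates are daily).

    There are no FKs and no NOT NULL constraints. The report table has multiple rows per campaign per day (because the aggregation is based on 6 attribute keys).

    When I add the join, query planning time grows to 10 seconds (versus 300 ms).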

    explain analyze select c.campaign_channel as channel,date as day , sum( displayed )  as displayed
    from report_campaigns c
    left join events_daily r on r.campaign_id = c.c_id
    where  provider_id = 7726 and c.p_id = 7726 and c.campaign_name <> 'test'
    and date >= '20170513 12:00' and date <= '20170515 12:00'
    group by c.campaign_channel,date;
                                                                                             QUERY PLAN
    -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
     GroupAggregate  (cost=71461.93..71466.51 rows=229 width=22) (actual time=104.189..114.788 rows=6 loops=1)
       Group Key: c.campaign_channel, r.date
       ->  Sort  (cost=71461.93..71462.51 rows=229 width=18) (actual time=100.263..106.402 rows=31205 loops=1)
             Sort Key: c.campaign_channel, r.date
             Sort Method: quicksort  Memory: 3206kB
             ->  Hash Join  (cost=1092.52..71452.96 rows=229 width=18) (actual time=22.149..86.955 rows=31205 loops=1)
                   Hash Cond: (r.campaign_id = c.c_id)
                   ->  Append  (cost=0.00..70245.84 rows=29948 width=20) (actual time=21.318..71.315 rows=31205 loops=1)
                         ->  Seq Scan on events_daily r  (cost=0.00..0.00 rows=1 width=20) (actual time=0.005..0.005 rows=0 loops=1)
                               Filter: ((date >= '2017-05-13 12:00:00'::timestamp without time zone) AND (date <= '2017-05-15 12:00:00'::timestamp without time zone) AND (provider_id =
                         ->  Bitmap Heap Scan on events_daily_20170513 r_1  (cost=685.36..23913.63 rows=1 width=20) (actual time=17.230..17.230 rows=0 loops=1)
                               Recheck Cond: (provider_id = 7726)
                               Filter: ((date >= '2017-05-13 12:00:00'::timestamp without time zone) AND (date <= '2017-05-15 12:00:00'::timestamp without time zone))
                               Rows Removed by Filter: 13769
                               Heap Blocks: exact=10276
                               ->  Bitmap Index Scan on events_daily_20170513_full_idx  (cost=0.00..685.36 rows=14525 width=0) (actual time=2.356..2.356 rows=13769 loops=1)
                                     Index Cond: (provider_id = 7726)
                         ->  Bitmap Heap Scan on events_daily_20170514 r_2  (cost=689.08..22203.52 rows=14537 width=20) (actual time=4.082..21.389 rows=15281 loops=1)
                               Recheck Cond: (provider_id = 7726)
                               Filter: ((date >= '2017-05-13 12:00:00'::timestamp without time zone) AND (date <= '2017-05-15 12:00:00'::timestamp without time zone))
                               Heap Blocks: exact=10490
                               ->  Bitmap Index Scan on events_daily_20170514_full_idx  (cost=0.00..685.45 rows=14537 width=0) (actual time=2.428..2.428 rows=15281 loops=1)
                                     Index Cond: (provider_id = 7726)
                         ->  Bitmap Heap Scan on events_daily_20170515 r_3  (cost=731.84..24128.69 rows=15409 width=20) (actual time=4.297..22.662 rows=15924 loops=1)
                               Recheck Cond: (provider_id = 7726)
                               Filter: ((date >= '2017-05-13 12:00:00'::timestamp without time zone) AND (date <= '2017-05-15 12:00:00'::timestamp without time zone))
                               Heap Blocks: exact=11318
                               ->  Bitmap Index Scan on events_daily_20170515_full_idx  (cost=0.00..727.99 rows=15409 width=0) (actual time=2.506..2.506 rows=15924 loops=1)
                                     Index Cond: (provider_id = 7726)
                   ->  Hash  (cost=1085.35..1085.35 rows=574 width=14) (actual time=0.815..0.815 rows=582 loops=1)
                         Buckets: 1024  Batches: 1  Memory Usage: 37kB
                         ->  Bitmap Heap Scan on report_campaigns c  (cost=12.76..1085.35 rows=574 width=14) (actual time=0.090..0.627 rows=582 loops=1)
                               Recheck Cond: (p_id = 7726)
                               Filter: ((campaign_name)::text <> 'test'::text)
                               Heap Blocks: exact=240
                               ->  Bitmap Index Scan on report_campaigns_provider_id  (cost=0.00..12.62 rows=577 width=0) (actual time=0.062..0.062 rows=582 loops=1)
                                     Index Cond: (p_id = 7726)
     Planning time: 9651.605 ms
     Execution time: 115.092 ms
    
    
    result:
     channel  |         day         | displayed
    ----------+---------------------+-----------
     Pin      | 2017-05-14 00:00:00 |   43434
     Pin      | 2017-05-15 00:00:00 |   3325325235
    

1 answer:

Answer 0 (score: 0)

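In my opinion, this happens because the SUM forces a pre-computation before the LEFT JOIN.

A possible solution is to apply the WHERE filters inside two nested sub-SELECTs, so that each table is filtered before the left join and the sum.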
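A minimal sketch of that idea follows, reusing the table and column names from the question. Which table each column belongs to (`provider_id`, `date` and `displayed` on `events_daily`) is an assumption inferred from the posted query plan. This has not been tested against the asker's schema, and the planner may flatten simple sub-SELECTs like these back into the outer query, so it is not certain that it changes the planning time.

```sql
-- Sketch only: filter each table in its own sub-SELECT, then join and aggregate.
-- Assumption: provider_id, date and displayed are columns of events_daily.
SELECT c.campaign_channel AS channel,
       r.date             AS day,
       sum(r.displayed)   AS displayed
FROM (
    -- Small metadata table, filtered by provider and campaign name
    SELECT c_id, campaign_channel
    FROM report_campaigns
    WHERE p_id = 7726
      AND campaign_name <> 'test'
) AS c
LEFT JOIN (
    -- Large events table, filtered by provider and date range
    SELECT campaign_id, date, displayed
    FROM events_daily
    WHERE provider_id = 7726
      AND date >= '20170513 12:00'
      AND date <= '20170515 12:00'
) AS r ON r.campaign_id = c.c_id
GROUP BY c.campaign_channel, r.date;
```

Note one difference from the original query: there, the conditions on `events_daily` sit in the outer WHERE clause, which discards campaigns without matching events and makes the LEFT JOIN behave like an inner join. In this sketch those campaigns are kept, with a NULL `day` and `displayed`.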

Hope this works.