Postgres window function increases query time

Date: 2018-04-27 11:06:10

Tags: postgresql

I am using Postgres 10 and have the following query:

select
    count(task.id) over() as _total_,
    json_agg(u.*) as users,
    task.*
from task
    left outer join taskuserlink_history tu on (task.id = tu.taskid)
    left outer join "user" u on (tu.userId = u.id)
group by task.id
offset 10 limit 10;

This query takes about 800 ms to execute.

If I remove the count(task.id) over() as _total_, line, it executes in 250 ms.

I have to admit I am a complete SQL noob, so the query itself may be completely botched.

I was wondering if someone could point out flaws in the query, and suggest ways to speed it up.

The number of tasks is about 15k, and each task is linked to users through taskuserlink.

I have looked at the pgAdmin "explain" graph,


but honestly can't really make sense of it yet ;)

The table definitions are:

  • task, with id (int) as the primary key column
  • taskuserlink_history, with taskId (int) and userId (int) (both foreign-key constraints, both indexed)
  • "user", with id (int) as the primary key column
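
A rough sketch of what that schema might look like as DDL (column names beyond those listed above, and all types, are assumptions; timeend and archived_at appear only in the plans' filter conditions below):

```sql
-- Hypothetical reconstruction of the described schema; not from the question.
CREATE TABLE task (
    id serial PRIMARY KEY
    -- ... other task columns elided
);

CREATE TABLE taskuserlink_history (
    taskId  int REFERENCES task (id),
    userId  int,          -- FK to the underlying user table
    timeend timestamptz   -- NULL while the link is active (see "Filter: (timeend IS NULL)" in the plans)
);

CREATE INDEX taskuserlink_history_task_fk_idx ON taskuserlink_history (taskId);
```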

The query plan is as follows:

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=4.74..12.49 rows=10 width=44) (actual time=1178.016..1178.043 rows=10 loops=1)
   Buffers: shared hit=3731, temp read=6655 written=6914
   ->  WindowAgg  (cost=4.74..10248.90 rows=13231 width=44) (actual time=1178.014..1178.040 rows=10 loops=1)
         Buffers: shared hit=3731, temp read=6655 written=6914
         ->  GroupAggregate  (cost=4.74..10083.51 rows=13231 width=36) (actual time=0.417..1049.294 rows=13255 loops=1)
               Group Key: task.id
               Buffers: shared hit=3731
               ->  Nested Loop Left Join  (cost=4.74..9586.77 rows=66271 width=36) (actual time=0.103..309.372 rows=66162 loops=1)
                     Join Filter: (taskuserlink_history.userid = user_archive.id)
                     Rows Removed by Join Filter: 1182904
                     Buffers: shared hit=3731
                     ->  Merge Left Join  (cost=0.58..5563.22 rows=66271 width=8) (actual time=0.044..73.598 rows=66162 loops=1)
                           Merge Cond: (task.id = taskuserlink_history.taskid)
                           Buffers: shared hit=3629
                           ->  Index Only Scan using task_pkey on task  (cost=0.29..1938.30 rows=13231 width=4) (actual time=0.026..7.683 rows=13255 loops=1)
                                 Heap Fetches: 13255
                                 Buffers: shared hit=1810
                           ->  Index Scan using taskuserlink_history_task_fk_idx on taskuserlink_history  (cost=0.29..2764.46 rows=66271 width=8) (actual time=0.015..40.109 rows=66162 loops=1)
                                 Filter: (timeend IS NULL)
                                 Rows Removed by Filter: 13368
                                 Buffers: shared hit=1819
                     ->  Materialize  (cost=4.17..50.46 rows=4 width=36) (actual time=0.000..0.001 rows=19 loops=66162)
                           Buffers: shared hit=102
                           ->  Bitmap Heap Scan on user_archive  (cost=4.17..50.44 rows=4 width=36) (actual time=0.050..0.305 rows=45 loops=1)
                                 Recheck Cond: (archived_at IS NULL)
                                 Heap Blocks: exact=11
                                 Buffers: shared hit=102
                                 ->  Bitmap Index Scan on user_unique_username  (cost=0.00..4.16 rows=4 width=0) (actual time=0.014..0.014 rows=46 loops=1)
                                       Buffers: shared hit=1
                                 SubPlan 1
                                   ->  Aggregate  (cost=8.30..8.31 rows=1 width=8) (actual time=0.003..0.003 rows=1 loops=45)
                                         Buffers: shared hit=90
                                         ->  Index Scan using task_assignedto_idx on task task_1  (cost=0.29..8.30 rows=1 width=4) (actual time=0.002..0.002 rows=0 loops=45)
                                               Index Cond: (assignedtoid = user_archive.id)
                                               Buffers: shared hit=90
 Planning time: 0.989 ms
 Execution time: 1191.451 ms
(37 rows)

Without the window function:

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=4.74..12.36 rows=10 width=36) (actual time=0.510..1.763 rows=10 loops=1)
   Buffers: shared hit=91
   ->  GroupAggregate  (cost=4.74..10083.51 rows=13231 width=36) (actual time=0.509..1.759 rows=10 loops=1)
         Group Key: task.id
         Buffers: shared hit=91
         ->  Nested Loop Left Join  (cost=4.74..9586.77 rows=66271 width=36) (actual time=0.073..0.744 rows=50 loops=1)
               Join Filter: (taskuserlink_history.userid = user_archive.id)
               Rows Removed by Join Filter: 361
               Buffers: shared hit=91
               ->  Merge Left Join  (cost=0.58..5563.22 rows=66271 width=8) (actual time=0.029..0.161 rows=50 loops=1)
                     Merge Cond: (task.id = taskuserlink_history.taskid)
                     Buffers: shared hit=7
                     ->  Index Only Scan using task_pkey on task  (cost=0.29..1938.30 rows=13231 width=4) (actual time=0.016..0.031 rows=11 loops=1)
                           Heap Fetches: 11
                           Buffers: shared hit=4
                     ->  Index Scan using taskuserlink_history_task_fk_idx on taskuserlink_history  (cost=0.29..2764.46 rows=66271 width=8) (actual time=0.009..0.081 rows=50 loops=1)
                           Filter: (timeend IS NULL)
                           Rows Removed by Filter: 11
                           Buffers: shared hit=3
               ->  Materialize  (cost=4.17..50.46 rows=4 width=36) (actual time=0.001..0.009 rows=8 loops=50)
                     Buffers: shared hit=84
                     ->  Bitmap Heap Scan on user_archive  (cost=4.17..50.44 rows=4 width=36) (actual time=0.040..0.382 rows=38 loops=1)
                           Recheck Cond: (archived_at IS NULL)
                           Heap Blocks: exact=7
                           Buffers: shared hit=84
                           ->  Bitmap Index Scan on user_unique_username  (cost=0.00..4.16 rows=4 width=0) (actual time=0.012..0.012 rows=46 loops=1)
                                 Buffers: shared hit=1
                           SubPlan 1
                             ->  Aggregate  (cost=8.30..8.31 rows=1 width=8) (actual time=0.005..0.005 rows=1 loops=38)
                                   Buffers: shared hit=76
                                   ->  Index Scan using task_assignedto_idx on task task_1  (cost=0.29..8.30 rows=1 width=4) (actual time=0.003..0.003 rows=0 loops=38)
                                         Index Cond: (assignedtoid = user_archive.id)
                                         Buffers: shared hit=76
 Planning time: 0.895 ms
 Execution time: 1.890 ms
(35 rows)

1 Answer:

Answer 0 (score: 0):

I believe the LIMIT clause is what's at play here. LIMIT restricts the number of rows returned, not the amount of work involved:

  • Your second query can abort early, after building 20 rows (OFFSET of 10, plus LIMIT of 10).
  • Your first query, however, needs to go through the entire set to compute count(task.id).
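One common workaround (a sketch of my own, not something the answer spells out) is to compute the total in a separate statement, so the paged query keeps its early-abort behavior:

```sql
-- Hypothetical rewrite: fetch the total once, on its own,
-- so the paged query can still stop after 20 rows.
SELECT count(*) AS _total_ FROM task;

SELECT json_agg(u.*) AS users, task.*
FROM task
    LEFT OUTER JOIN taskuserlink_history tu ON task.id = tu.taskid
    LEFT OUTER JOIN "user" u ON tu.userId = u.id
GROUP BY task.id
ORDER BY task.id
OFFSET 10 LIMIT 10;
```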

Not what you asked, but I'll say it anyway:

  • "user" is not a table, but a view. That is why both queries are actually much slower than they should be (note the "Materialize" node in the plans).
  • Be careful paginating with OFFSET, since it gets slower as the OFFSET increases.
  • Using OFFSET and LIMIT without an ORDER BY is most likely not what you want: the result sets of consecutive calls may not be consistent.
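To illustrate the last two points, keyset pagination orders by a unique key and filters on the last value seen, so the cost stays flat as you page deeper (a sketch; the literal id 42 stands in for whatever the previous page ended on):

```sql
-- Hypothetical keyset pagination: instead of OFFSET n,
-- remember the last task.id from the previous page.
SELECT json_agg(u.*) AS users, task.*
FROM task
    LEFT OUTER JOIN taskuserlink_history tu ON task.id = tu.taskid
    LEFT OUTER JOIN "user" u ON tu.userId = u.id
WHERE task.id > 42        -- last id seen on the previous page
GROUP BY task.id
ORDER BY task.id
LIMIT 10;
```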