Can I somehow provide additional context so that Postgres can efficiently order/limit a view without computing all of its rows?

Asked: 2019-06-12 15:16:26

Tags: database postgresql optimization view

Given this contrived query

select id, pg_sleep(0.001)::text from administrative_areas;

When I add an ORDER BY and a LIMIT directly to it, the sleep runs only once and the result comes back quickly.

> explain analyze select id, pg_sleep(0.001)::text from administrative_areas order by id desc limit 1;

Limit  (cost=0.28..0.39 rows=1 width=36) (actual time=4.227..4.228 rows=1 loops=1)
  ->  Index Only Scan Backward using administrative_areas_pkey on administrative_areas  (cost=0.28..69.50 rows=604 width=36) (actual time=4.227..4.227 rows=1 loops=1)
        Heap Fetches: 1
Planning time: 0.066 ms
Execution time: 4.243 ms

If I put the same query in a view

CREATE OR REPLACE VIEW sleepy AS
    select id, pg_sleep(0.001)::text from administrative_areas;

then querying the view with an ORDER BY and a LIMIT runs the sleep once for every row in the underlying administrative_areas table.

> explain analyze select * from sleepy order by id desc limit 1;

Limit  (cost=30.63..30.63 rows=1 width=36) (actual time=3794.827..3794.829 rows=1 loops=1)
  ->  Sort  (cost=30.63..32.14 rows=604 width=36) (actual time=3794.825..3794.825 rows=1 loops=1)
        Sort Key: administrative_areas.id DESC
        Sort Method: top-N heapsort  Memory: 25kB
        ->  Seq Scan on administrative_areas  (cost=0.00..21.57 rows=604 width=36) (actual time=6.432..3792.566 rows=604 loops=1)
Planning time: 0.072 ms
Execution time: 3794.851 ms

Is there anything I can add to the view, or any additional context I can supply at query time, so that the planner optimizes this?

1 Answer:

Answer 0 (score: 0)

I believe this is because pg_sleep is a volatile function. When you query the view, you are effectively running this:

select * from (select id, pg_sleep(0.001)::text from administrative_areas) as sleepy order by id desc limit 1;

Postgres sees the volatile function in the subquery and runs it for every row. Let's test that.

create table test as select id from generate_series(1, 1000) g(id);
create index on test(id);
analyze test;
create view sleepy as select id, pg_sleep(0.001)::text from test;

explain analyze select * from sleepy order by id desc limit 1;
                                                     QUERY PLAN
---------------------------------------------------------------------------------------------------------------------
 Limit  (cost=37.50..37.50 rows=1 width=36) (actual time=1640.368..1640.439 rows=1 loops=1)
   ->  Sort  (cost=37.50..40.00 rows=1000 width=36) (actual time=1640.336..1640.358 rows=1 loops=1)
         Sort Key: test.id DESC
         Sort Method: top-N heapsort  Memory: 25kB
         ->  Seq Scan on test  (cost=0.00..22.50 rows=1000 width=36) (actual time=1.511..1623.058 rows=1000 loops=1)
 Planning Time: 0.175 ms
 Execution Time: 1640.617 ms
(7 rows)

As expected, this runs pg_sleep for every row in test.
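To check how a function is classified, you can look it up in the system catalog. This is a minimal sketch, assuming you have read access to pg_proc; the provolatile column holds 'i' for immutable, 's' for stable and 'v' for volatile.

```sql
-- Look up the declared volatility of pg_sleep in the system catalog.
-- provolatile: 'i' = immutable, 's' = stable, 'v' = volatile
select proname, provolatile
from pg_proc
where proname = 'pg_sleep';
```

The same query works for your own functions if you substitute their name in the WHERE clause.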

Now try it with a stable function:

create function not_so_sleepy() 
  returns void AS 
  $$ 
    select pg_sleep(0.001) 
  $$ language sql 
stable;  -- NOTE: this is just to trick postgres


create view not_as_sleepy as 
  select id, 
  not_so_sleepy()::text 
FROM test;

explain analyze select * 
  from not_as_sleepy 
  order by id desc limit 1;
                                                                QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.28..0.32 rows=1 width=36) (actual time=0.049..0.198 rows=1 loops=1)
   ->  Index Only Scan Backward using test_id_idx on test  (cost=0.28..43.27 rows=1000 width=36) (actual time=0.024..0.048 rows=1 loops=1)
         Heap Fetches: 1
 Planning Time: 1.786 ms
 Execution Time: 0.308 ms
(5 rows)

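In the second case we told Postgres that the function has no side effects, so it can safely skip calling it for rows that are never returned. The function therefore has to be marked stable or immutable (and, of course, it must actually be stable or immutable) so that Postgres does not have to run it for every row.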
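If the function in your real view is one of your own and was created without a volatility marker, it defaults to volatile. A sketch of re-declaring it follows; expensive_lookup is a hypothetical name used only for illustration, and the change is only safe if the function really has no side effects.

```sql
-- Hypothetical function, for illustration only.
-- With no volatility keyword it is created as VOLATILE.
create function expensive_lookup(area_id integer)
  returns text as
  $$
    select 'area-' || area_id::text
  $$ language sql;

-- Re-declare it as STABLE: it modifies nothing and returns the same
-- result for the same argument within a single statement.
alter function expensive_lookup(integer) stable;
```

Existing views that call the function do not need to be recreated, because the planner reads the volatility from the catalog when it plans each query.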