Why does a query run 76 times slower when it is put into a function?

Time: 2019-07-12 08:30:31

Tags: postgresql query-planner postgresql-12

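When I put the following query into a function, it runs 76 times slower. The only difference between the plans is a Bitmap Index Scan versus an Index Scan.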

Plan 1: http://tatiyants.com/pev/#/plans/plan_1562919134481

Plan 2: http://tatiyants.com/pev/#/plans/plan_1562918860704

Plan 1

EXPLAIN (ANALYZE, COSTS, VERBOSE, BUFFERS, FORMAT JSON)
        SELECT
            sum( t.group_suma ) OVER( PARTITION BY (t.o).id ) AS total_suma,
            *
        FROM (
            SELECT
             sum( ocd.item_cost     ) AS group_cost,
             sum( ocd.item_suma     ) AS group_suma,
             max( (ocd.ic).consumed ) AS consumed,
             (ocd.ic).consumed_period,
             ocd.o
            FROM order_cost_details( tstzrange( '2019-04-01', '2019-05-01' ) ) ocd
            GROUP BY ocd.o, (ocd.ic).consumed_period
        ) t
WHERE (t.o).id IN ( 6154 ) AND t.consumed_period @> '2019-04-01'::timestamptz
;

Plan 2

EXPLAIN (ANALYZE, COSTS, VERBOSE, BUFFERS, FORMAT JSON)
SELECT * FROM order_total_suma( tstzrange( '2019-04-01', '2019-05-01' ) ) ots 
WHERE (ots.o).id IN ( 6154 ) AND ots.consumed_period @> '2019-04-01'::timestamptz
;

The function:

CREATE FUNCTION "order_total_suma" (in _target_range tstzrange default app_period())
 RETURNS    table(
        total_suma  double precision,
        group_cost  double precision,
        group_suma  double precision,
        consumed    double precision,
        consumed_period tstzrange,
        o order_bt
    )

 LANGUAGE sql
 STABLE
 AS $$
    SELECT
        sum( t.group_suma ) OVER( PARTITION BY (t.o).id ) AS total_suma,
        *
    FROM (
        SELECT
         sum( ocd.item_cost     ) AS group_cost,
         sum( ocd.item_suma     ) AS group_suma,
         max( (ocd.ic).consumed ) AS consumed,
         (ocd.ic).consumed_period,
         ocd.o
        FROM order_cost_details( _target_range ) ocd
        GROUP BY ocd.o, (ocd.ic).consumed_period
    ) t
$$
;

Why is the filter applied only at the last Subquery Scan for the query inside the function?


Is there anything I can do to make the two perform equally?

UPD
The server version is PostgreSQL 12beta2.
Because of the 30,000-character limit, I posted the plans here and here.

2 answers:

Answer 0: (score: 2)

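Thanks to RhodiumToad on IRC:

"I suspect that something is preventing the planner from deducing that (t.o).id can safely be pushed down through the GROUP BY on ocd.o."

"Pulling it out into a separate column can work around that."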
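
A minimal sketch for checking where the filter ends up, assuming a scratch database. The names demo_order and demo_items are hypothetical stand-ins for order_bt and order_cost_details(); they are not part of the original schema. Comparing the position of the Filter line in the two EXPLAIN outputs shows whether the condition was applied above or below the window and aggregate nodes.

```sql
-- Hypothetical stand-ins for order_bt and order_cost_details().
CREATE TYPE demo_order AS (id int, label text);
CREATE TABLE demo_items (o demo_order, item_suma double precision);

-- Variant A: partition and filter on a field of the composite column.
EXPLAIN
SELECT *
FROM (
    SELECT sum(t.group_suma) OVER (PARTITION BY (t.o).id) AS total_suma, *
    FROM (
        SELECT sum(i.item_suma) AS group_suma, i.o
        FROM demo_items i
        GROUP BY i.o
    ) t
) s
WHERE (s.o).id = 6154;

-- Variant B: expose the field as its own column and use that column
-- for both the partition and the filter.
EXPLAIN
SELECT *
FROM (
    SELECT sum(t.group_suma) OVER (PARTITION BY t.order_id) AS total_suma, *
    FROM (
        SELECT sum(i.item_suma) AS group_suma, i.o, (i.o).id AS order_id
        FROM demo_items i
        GROUP BY i.o, (i.o).id
    ) t
) s
WHERE s.order_id = 6154;
```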

So I additionally select (ocd.o).id as a separate column, order_id, and my final query is shown below. This change also makes the call through the function faster; I only need to filter on the GROUP BY column:

    SELECT * FROM (
            SELECT
                sum( t.group_suma ) OVER( PARTITION BY t.order_id ) AS total_suma,
--              sum( t.group_suma ) OVER( PARTITION BY (t.o).id ) AS total_suma,  -- For any WHERE this takes 2700ms
                *
            FROM (
                SELECT
                 sum( ocd.item_cost     ) AS group_cost,
                 sum( ocd.item_suma     ) AS group_suma,
                 max( (ocd.ic).consumed ) AS consumed,
                 (ocd.ic).consumed_period,
                 ocd.o,
                 (ocd.o).id as order_id
                FROM order_cost_details( tstzrange( '2019-04-01', '2019-05-01' ) ) ocd
                GROUP BY ocd.o, (ocd.o).id, (ocd.ic).consumed_period
            ) t
    ) t
    WHERE t.order_id = 6154 AND t.consumed_period @> '2019-04-01'::timestamptz       -- This takes 2ms
--  WHERE (t.o).id = 6154 AND t.consumed_period @> '2019-04-01'::timestamptz   -- This takes 2700ms
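
For the function itself, the same idea might look like the sketch below. The name order_total_suma_v2 is hypothetical (a new name avoids the restriction that CREATE OR REPLACE cannot change an existing function's return columns), and the integer type of order_id is an assumption about (order_bt).id.

```sql
-- Sketch only: the original function with order_id exposed as its own column.
CREATE FUNCTION order_total_suma_v2 (_target_range tstzrange DEFAULT app_period())
RETURNS TABLE (
    total_suma      double precision,
    group_cost      double precision,
    group_suma      double precision,
    consumed        double precision,
    consumed_period tstzrange,
    o               order_bt,
    order_id        integer   -- assumed type of (order_bt).id
)
LANGUAGE sql
STABLE
AS $$
    SELECT
        sum( t.group_suma ) OVER( PARTITION BY t.order_id ) AS total_suma,
        *
    FROM (
        SELECT
         sum( ocd.item_cost     ) AS group_cost,
         sum( ocd.item_suma     ) AS group_suma,
         max( (ocd.ic).consumed ) AS consumed,
         (ocd.ic).consumed_period,
         ocd.o,
         (ocd.o).id AS order_id
        FROM order_cost_details( _target_range ) ocd
        GROUP BY ocd.o, (ocd.o).id, (ocd.ic).consumed_period
    ) t
$$;

-- Callers filter on the plain column instead of (ots.o).id.
SELECT *
FROM order_total_suma_v2( tstzrange( '2019-04-01', '2019-05-01' ) ) ots
WHERE ots.order_id = 6154
  AND ots.consumed_period @> '2019-04-01'::timestamptz;
```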

Answer 1: (score: 0)

The plans are completely different.

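The problem is that the row count of the join between public.order_bt and the split_period subquery is misestimated. As a result, the function public.service_level_price is evaluated 2882 times instead of once, and that is where the time is spent.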

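I am not sure what to do about it (we don't have the view definitions, and it could get nasty). Raising the function's COST probably won't help, because the optimizer thinks it will be called only once.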

Actually, your best bet may be

ALTER FUNCTION public.calc_item_suma ROWS 1;

This might make the optimizer choose a different plan.
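
Before and after such a change, it can help to look at what the planner currently assumes for the functions involved. The query below is a sketch that reads the system catalog pg_proc; it assumes both function names mentioned in this answer exist in the database. Note that ROWS can only be set on set-returning functions (proretset = true).

```sql
-- prorows: estimated number of result rows (0 if not set-returning)
-- procost: estimated per-call cost, in units of cpu_operator_cost
SELECT p.oid::regprocedure AS signature,
       p.proretset,
       p.prorows,
       p.procost
FROM pg_proc p
WHERE p.proname IN ('calc_item_suma', 'service_level_price');
```

Rerunning EXPLAIN (ANALYZE, BUFFERS) on the slow query afterwards shows whether the estimates and the chosen plan actually changed.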