Execute subquery factoring first, before any other SQL

Time: 2019-04-11 09:20:25

Tags: sql oracle performance subquery query-optimization

I have a very complex view of the following form:

create or replace view loan_vw as 
select * from (with loan_info as (select loan_table.*,commission_table.* 
                                   from loan_table,
                                  commission_table where 
                                  contract_id=commission_id)
                select /*complex transformations */ from loan_info
                where type <> 'PRINCIPAL'
                union all 
                select /*complex transformations */ from loan_info
                where type = 'PRINCIPAL')

Now, if I run the following, the query hangs:

         select * from loan_vw where contract_id='HA001234TY56';

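However, if I hard-code the value inside the subquery factoring clause, or use a package-level variable in the same session, the query returns within a second: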

create or replace view loan_vw as 
        select * from (with loan_info as (select loan_table.*,commission_table.* 
                                           from loan_table,
                                          commission_table where 
                                          contract_id=commission_id
                                          and contract_id='HA001234TY56'
                                          )
                        select /*complex transformations */ from loan_info
                        where type <> 'PRINCIPAL'
                        union all 
                        select /*complex transformations */ from loan_info
                        where type = 'PRINCIPAL')

Because I am using Business Objects, I cannot use a package-level variable.

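So my question is: is there a hint in Oracle that tells the optimizer to apply the contract_id filter from the query on loan_vw inside the subquery factoring clause first?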

As requested, the analytic function used is as follows:

select value_date, item, credit_entry, item_paid
from (
  select value_date, item, credit_entry, debit_entry,
    greatest(0, least(credit_entry, nvl(sum(debit_entry) over (), 0)
      - nvl(sum(credit_entry) over (order by value_date
          rows between unbounded preceding and 1 preceding), 0))) as item_paid
  from your_table
)
where item is not null;

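After following the suggestions from Boneist and MarcinJ, I removed the subquery factoring (CTE) and wrote one long query, shown below, which brought the run time down from 3 minutes to 0.156 seconds.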

  create or replace view loan_vw as
  select /*complex transformations */
                               from loan_table,
                              commission_table where 
                              contract_id=commission_id
               and loan_table.type <> 'PRINCIPAL'
  union all
  select /*complex transformations */
                               from loan_table,
                              commission_table where 
                              contract_id=commission_id
               and loan_table.type = 'PRINCIPAL'

2 answers:

Answer 0: (score: 4)

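Do the transformations really need to be complex enough to require a UNION ALL? It is really hard to optimize something you can't see, but have you tried getting rid of the CTE and implementing the calculations inline?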

CREATE OR REPLACE VIEW loan_vw AS
SELECT loan.contract_id
     , CASE commission.type -- or wherever this comes from
         WHEN 'PRINCIPAL'
         THEN SUM(whatever) OVER (PARTITION BY loan.contract_id, loan.type) -- total_whatever

         ELSE SUM(something_else) OVER (PARTITION BY loan.contract_id, loan.type) -- total_something_else
      END AS whatever_something
  FROM loan_table loan 
 INNER 
  JOIN commission_table commission
    ON loan.contract_id = commission.commission_id

Note that if your analytic functions do not have PARTITION BY contract_id, an index on that contract_id column cannot be used at all.

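Take a look at this db fiddle (you'll have to click on ... on the last result table to expand the results). There, the loan table has an indexed (PK) contract_id column, and also a some_other_id column that is unique but not indexed, and the predicate on the outer query is still on contract_id. If you compare the plans for partition by contract and partition by other id, you'll see that the index is not used at all in the partition by other id plan: there is a TABLE ACCESS with FULL options on the loan table, as compared to INDEX - UNIQUE SCAN in partition by contract. That is because the optimizer cannot work out the relation between contract_id and some_other_id on its own, so it has to run SUM or AVG over the entire window instead of limiting the number of window rows through the index.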
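Since the fiddle link may not survive, here is a minimal sketch of the comparison described above. The table `loan_demo`, its `amount` column, and both view names are hypothetical and only stand in for the fiddle's setup; they are not the asker's schema.

```sql
-- Hypothetical demo table (assumed names, not the asker's schema).
CREATE TABLE loan_demo (
  contract_id   VARCHAR2(12) PRIMARY KEY,  -- indexed through the primary key
  some_other_id NUMBER NOT NULL,           -- unique in the data, but no constraint and no index
  amount        NUMBER
);

-- Window partitioned by the same column the outer query filters on.
CREATE OR REPLACE VIEW loan_by_contract_vw AS
SELECT contract_id,
       SUM(amount) OVER (PARTITION BY contract_id) AS total_amount
FROM   loan_demo;

-- Window partitioned by a column the optimizer cannot relate to contract_id.
CREATE OR REPLACE VIEW loan_by_other_vw AS
SELECT contract_id,
       SUM(amount) OVER (PARTITION BY some_other_id) AS total_amount
FROM   loan_demo;

-- Compare the two plans for the same outer predicate.
EXPLAIN PLAN FOR
SELECT * FROM loan_by_contract_vw WHERE contract_id = 'HA001234TY56';
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);

EXPLAIN PLAN FOR
SELECT * FROM loan_by_other_vw WHERE contract_id = 'HA001234TY56';
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

If the behaviour matches the description above, the first plan should reach the table through the primary key index, while the second should read the whole table before filtering. The exact plan shape depends on the Oracle version and statistics.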

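If you have a dimension table with those contracts, you can also try joining it to the results and exposing contract_id from the dimension table rather than from the (most likely huge) loan fact table. Sometimes this improves cardinality estimates through the use of a unique index on the dimension table.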
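A rough sketch of that idea, assuming a hypothetical dimension table `contract_dim` with a unique index on `contract_id` and a hypothetical `amount` column on the loan table (neither appears in the question):

```sql
-- contract_dim, loan_dim_vw and amount are assumed names for illustration only.
CREATE OR REPLACE VIEW loan_dim_vw AS
SELECT d.contract_id,  -- exposed from the dimension table, not from loan_table
       SUM(l.amount) OVER (PARTITION BY d.contract_id) AS total_amount
FROM   contract_dim d
       INNER JOIN loan_table l ON l.contract_id = d.contract_id;

-- The outer predicate now targets the dimension table's column.
SELECT * FROM loan_dim_vw WHERE contract_id = 'HA001234TY56';
```

Whether this helps depends on the data and the statistics, so it is worth comparing the plans before and after.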

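Again, it is really hard to optimize a black box without the query or even a plan, so we don't know what is going on. The CTE or a subquery may be getting materialized unnecessarily, for example.

Answer 1: (score: 1)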
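One way to test the materialization theory is sketched below. It relies on the `INLINE` hint, which is widely used but undocumented by Oracle, so treat it as an assumption rather than a supported feature. The view name `loan_inline_vw` and the reduced column list are hypothetical, and the sketch assumes `contract_id` and `type` both live in `loan_table`.

```sql
-- Hypothetical view; the real one would keep the complex transformations.
CREATE OR REPLACE VIEW loan_inline_vw AS
WITH loan_info AS (
  SELECT /*+ INLINE */   -- undocumented hint: ask the optimizer not to materialize the CTE
         l.contract_id,
         l.type
  FROM   loan_table l
         INNER JOIN commission_table c ON l.contract_id = c.commission_id
)
SELECT contract_id, type FROM loan_info WHERE type <> 'PRINCIPAL'
UNION ALL
SELECT contract_id, type FROM loan_info WHERE type = 'PRINCIPAL';

-- Inspect the plan for the filtered query.
EXPLAIN PLAN FOR
SELECT * FROM loan_inline_vw WHERE contract_id = 'HA001234TY56';
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

A materialized CTE typically shows up in the plan as a TEMP TABLE TRANSFORMATION step with a LOAD AS SELECT under it. If those steps appear for the original view and disappear with the hint, materialization is a likely reason the contract_id predicate was not being pushed down.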

Thanks for updating your question to include an example of the column list.

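Given your updated query, I'd suggest changing your view to something like the following. If your original view may be used to query multiple contract_ids, you could instead create a second view for querying a single contract_id (unless, of course, the results of the original view only make sense for individual contract_ids!):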

CREATE OR REPLACE VIEW loan_vw AS 
WITH loan_info AS (SELECT l.*, c.* -- for future-proofing, you should list the column names explicitly; if this statement is rerun and there's a column with the same name in both tables, it'll fail.
                   FROM   loan_table l
                          INNER JOIN commission_table c ON l.contract_id = c.commission_id -- you should always alias the join condition columns for ease of maintenance.
                  )
SELECT value_date,
     item,
     credit_entry,
     debit_entry,
     GREATEST(0,
            LEAST(credit_entry,
                NVL(SUM(debit_entry) OVER (PARTITION BY contract_id), 0)
                  - NVL(SUM(credit_entry) OVER (PARTITION BY contract_id ORDER BY value_date ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING), 0))) AS item_paid
FROM   loan_info
WHERE  TYPE <> 'PRINCIPAL'
UNION ALL
SELECT ...
FROM   loan_info
WHERE  TYPE = 'PRINCIPAL';

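Note that I've converted your join to ANSI syntax, because it is easier to understand than the old-style joins (it is easier to separate join conditions from predicates, for a start!).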