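Oracle top-n query sort performance

Date: 2016-08-25 02:31:12

Tags: sql oracle query-optimization oracle12c

I'm new to SQL optimization. I'm trying to understand why having multiple items in an IN clause causes a large performance penalty and, if possible, how to prevent it. Below is more or less what I'm working with. The second query is the one I currently have and want to speed up. In real life, TABLE_1 has millions of rows, and the sort step of the plan has a CPU cost of 21M.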

SELECT 
    TOPNWRAPPER.*, 
    TABLE_2.X, 
    TABLE_2.Y 
FROM 
    TABLE_2, 
    ( 
        SELECT 
            * 
        FROM 
            ( 
                SELECT 
                    /*+ index (TABLE_1 TABLE_1_B_E_F_ID) */ 
                    TABLE_1.ID, 
                    TABLE_1.C, 
                    TABLE_1.B, 
                    TABLE_1.E, 
                    TABLE_1.F
                FROM 
                    TABLE_1 
                WHERE 
                    ( TABLE_1.F IN ( 'STATE1' ) ) AND 
                    ( TABLE_1.B= 'SOMETEXT' ) AND 
                    ( TABLE_1.C=1 ) AND 
                    ( TABLE_1.E= 'IN' ) AND 
                    ( TABLE_1.D IS NULL ) 
                ORDER BY 
                    TABLE_1.ID 
            ) 
        WHERE 
            ( ROWNUM <= 150 ) 
    ) TOPNWRAPPER 
WHERE 
    ( TOPNWRAPPER.ID = TABLE_2.T1_ID_FK ) 
ORDER BY 
    TOPNWRAPPER.ID ASC

Statistics:

----------------------------------------------------------------------------------------------------------------
| Id  | Operation                         | Name             | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
----------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                  |                  |      1 |        |    120 |00:00:00.01 |     965 |
|   1 |  NESTED LOOPS                     |                  |      1 |        |    120 |00:00:00.01 |     965 |
|   2 |   NESTED LOOPS                    |                  |      1 |      1 |    120 |00:00:00.01 |     845 |
|   3 |    VIEW                           |                  |      1 |      1 |    120 |00:00:00.01 |     245 |
|*  4 |     COUNT STOPKEY                 |                  |      1 |        |    120 |00:00:00.01 |     245 |
|   5 |      VIEW                         |                  |      1 |      1 |    120 |00:00:00.01 |     245 |
|*  6 |       TABLE ACCESS BY INDEX ROWID | TABLE_1          |      1 |      1 |    120 |00:00:00.01 |     245 |
|*  7 |        INDEX RANGE SCAN           | TABLE_1_B_E_F_ID |      1 |     25 |    120 |00:00:00.01 |     125 |
|*  8 |    INDEX RANGE SCAN               | TABLE_2_T1_ID_FK |    120 |      1 |    120 |00:00:00.01 |     600 |
|   9 |   TABLE ACCESS BY INDEX ROWID     | TABLE_2          |    120 |      1 |    120 |00:00:00.01 |     120 |
----------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - filter(ROWNUM<=150)
   6 - filter(("TABLE_1"."C"=1 AND "TABLE_1"."D" IS NULL))
   7 - access("TABLE_1"."B"='SOMETEXT' AND
              "TABLE_1"."E"='IN' AND "TABLE_1"."F"='STATE1')
   8 - access("TOPNWRAPPER"."ID"="TABLE_2"."T1_ID_FK")

When I update the query to also include 'STATE2' in the IN clause, an extra sort step is added to the plan.

SELECT 
    TOPNWRAPPER.*, 
    TABLE_2.X, 
    TABLE_2.Y 
FROM 
    TABLE_2, 
    ( 
        SELECT 
            * 
        FROM 
            ( 
                SELECT 
                    /*+ index (TABLE_1 TABLE_1_B_E_F_ID) */ 
                    TABLE_1.ID, 
                    TABLE_1.C, 
                    TABLE_1.B, 
                    TABLE_1.E, 
                    TABLE_1.F
                FROM 
                    TABLE_1 
                WHERE 
                    ( TABLE_1.F IN ( 'STATE1', 'STATE2' ) ) AND 
                    ( TABLE_1.B= 'SOMETEXT' ) AND 
                    ( TABLE_1.C=1 ) AND 
                    ( TABLE_1.E= 'IN' ) AND 
                    ( TABLE_1.D IS NULL ) 
                ORDER BY 
                    TABLE_1.ID 
            ) 
        WHERE 
            ( ROWNUM <= 150 ) 
    ) TOPNWRAPPER 
WHERE 
    ( TOPNWRAPPER.ID = TABLE_2.T1_ID_FK ) 
ORDER BY 
    TOPNWRAPPER.ID ASC

Statistics:

---------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name             | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
---------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |                  |      1 |        |    150 |00:00:00.01 |    1076 |       |       |          |
|   1 |  NESTED LOOPS                       |                  |      1 |        |    150 |00:00:00.01 |    1076 |       |       |          |
|   2 |   NESTED LOOPS                      |                  |      1 |      1 |    150 |00:00:00.01 |     926 |       |       |          |
|   3 |    VIEW                             |                  |      1 |      1 |    150 |00:00:00.01 |     176 |       |       |          |
|*  4 |     COUNT STOPKEY                   |                  |      1 |        |    150 |00:00:00.01 |     176 |       |       |          |
|   5 |      VIEW                           |                  |      1 |      1 |    150 |00:00:00.01 |     176 |       |       |          |
|*  6 |       SORT ORDER BY STOPKEY         |                  |      1 |      1 |    150 |00:00:00.01 |     176 | 15360 | 15360 |14336  (0)|
|   7 |        INLIST ITERATOR              |                  |      1 |        |    165 |00:00:00.01 |     176 |       |       |          |
|*  8 |         TABLE ACCESS BY INDEX ROWID | TABLE_1          |      2 |      1 |    165 |00:00:00.01 |     176 |       |       |          |
|*  9 |          INDEX RANGE SCAN           | TABLE_1_B_E_F_ID |      2 |     50 |    165 |00:00:00.01 |      11 |       |       |          |
|* 10 |    INDEX RANGE SCAN                 | TABLE_2_T1_ID_FK |    150 |      1 |    150 |00:00:00.01 |     750 |       |       |          |
|  11 |   TABLE ACCESS BY INDEX ROWID       | TABLE_2          |    150 |      1 |    150 |00:00:00.01 |     150 |       |       |          |
---------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - filter(ROWNUM<=150)
   6 - filter(ROWNUM<=150)
   8 - filter(("TABLE_1"."C"=1 AND "TABLE_1"."D" IS NULL))
   9 - access("TABLE_1"."B"='SOMETEXT' AND
              "TABLE_1"."E"='IN' AND (("TABLE_1"."F"='STATE1') OR ("TABLE_1"."F"='STATE2')))
  10 - access("TOPNWRAPPER"."ID"="TABLE_2"."T1_ID_FK")

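I've been searching for several days. One suggestion I tried was the hint /*+ USE_CONCAT (OR_PREDICATES(1)) */, which roughly halves the memory usage but doesn't eliminate the problem entirely.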

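Edit: I looked around (http://use-the-index-luke.com/sql/sorting-grouping/indexed-order-by#tip-ixord-full) and think this may be caused by the ORDER BY. If I change the ORDER BY clauses to TABLE_1.F, TABLE_1.ID in the inner query and TOPNWRAPPER.F, TOPNWRAPPER.ID ASC in the outer query, the sort operation goes away. Unfortunately, I need the top n rows by ID. Alternatively, I tried creating a new index on (ID, F) as a test. That also removed the sort operation, but ID is unique for every row, which makes the table access operation less efficient.

Edit 2: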

OPERATION      |OPTION           |CPU COST
--------------------------------------------
SORT           |ORDER BY STOPKEY |21042774
|NESTED LOOPS  |OUTER            |56052
||TABLE ACCESS |BY INDEX ROWID   |38980
|||INDEX       |RANGE SCAN       |30086

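2 answers:

Answer 0: (score: 2)

The performance difference probably doesn't matter. The execution plans differ because access through a multi-column index returns implicitly sorted results only when the leading columns use equality conditions.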

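Performance difference

Don't worry too much about the cost shown in an execution plan. Even though the optimizer is called the "cost-based optimizer", the cost is an odd number that only a few people in the world fully understand.

One reason comparing explain plan costs is complicated is that the total cost is sometimes lower than the cost of one of the child operations. As I explained in another answer of mine, this can happen with a COUNT STOPKEY operation. It is Oracle's way of saying "this child operation costs this much, but COUNT STOPKEY will probably cut it off before it gets that high". It is usually best to compare plans by their top-level cost, but even that number can be misleading, as other examples in that answer show.

This means that run time is usually the only thing that matters. If the A-Time (actual time) of both queries is only 0.1 seconds, your work here is probably done.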
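The remark about COUNT STOPKEY cutting a child operation short can be illustrated outside the database. The following is only a rough Python sketch, not Oracle's implementation: `scan`, `TOTAL_ROWS`, and the row counters are hypothetical names introduced for illustration. It contrasts a stop key over already-ordered input, which reads just the first 150 rows, with a top-n sort over unordered input, which must read every row before it can return anything.

```python
import heapq
import random
from itertools import islice

TOTAL_ROWS = 10_000  # hypothetical table size, for illustration only
TOP_N = 150          # same limit as ROWNUM <= 150 in the question


def scan(ids, stats):
    """Yield ids one at a time and count how many rows were actually read."""
    for row_id in ids:
        stats["rows_read"] += 1
        yield row_id


# Case 1: rows arrive already ordered, so we can stop after TOP_N rows
# (loosely analogous to COUNT STOPKEY over an ordered index range scan).
ordered_stats = {"rows_read": 0}
ordered_top = list(islice(scan(range(TOTAL_ROWS), ordered_stats), TOP_N))

# Case 2: rows arrive in arbitrary order, so every row must be read before
# the smallest TOP_N are known (loosely analogous to SORT ORDER BY STOPKEY,
# which keeps only the best TOP_N candidates in memory).
shuffled = list(range(TOTAL_ROWS))
random.Random(0).shuffle(shuffled)
sorted_stats = {"rows_read": 0}
sorted_top = heapq.nsmallest(TOP_N, scan(shuffled, sorted_stats))

print("ordered input, rows read:  ", ordered_stats["rows_read"])
print("unordered input, rows read:", sorted_stats["rows_read"])
```

Both cases return the same 150 rows; only the amount of input consumed differs. Whether that difference is noticeable in a real query depends on how many rows match the predicates, which is why measured run time remains the deciding factor.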

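Execution plan difference

The difference in execution plans is caused by the way multi-column indexes are stored and accessed. Sometimes the results of an index scan come back already sorted, and sometimes they don't. That is why one plan has only COUNT STOPKEY while the other adds the more expensive SORT ORDER BY STOPKEY.

To demonstrate this plan difference, create a simple table and index with only 2 columns and 4 rows: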

create table test1 as 
select 1 a, 10 b from dual union all
select 1 a, 30 b from dual union all
select 2 a, 20 b from dual union all
select 2 a, 40 b from dual;

create index test1_idx on test1(a, b);

begin
    dbms_stats.gather_table_stats(user, 'TEST1');
end;
/

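Below is a simplified picture of how the index is stored. The data is ordered first by the leading column and then by the trailing column.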

               +----+
        +------+Root+-------+
        |      +----+       |
        |                   |
      +-v-+               +-v-+
   +--+A=1+--+         +--+A=2+--+
   |  +---+  |         |  +---+  |
   |         |         |         |
 +-v--+   +--v-+     +-v--+   +--v-+
 |B=10|   |B=30|     |B=20|   |B=40|
 +----+   +----+     +----+   +----+

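If a query accesses only one value of the leading column A, it can read the values of column B in order without any extra work. Oracle goes to one of the A blocks and then reads the B blocks in order, without even trying.

Note how this query has an ORDER BY, but there is no SORT in the execution plan.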

explain plan for select * from test1 where a = 1 and b > 0 order by b;
select * from table(dbms_xplan.display(format => 'basic'));

Plan hash value: 598212486

--------------------------------------
| Id  | Operation        | Name      |
--------------------------------------
|   0 | SELECT STATEMENT |           |
|   1 |  INDEX RANGE SCAN| TEST1_IDX |
--------------------------------------

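But if a query accesses multiple values of the leading column A, the results are not necessarily retrieved in order of B. Oracle can read the A blocks in order, but the B blocks are ordered only within a single A value.

Now an extra SORT ORDER BY operation appears in the execution plan.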

explain plan for select * from test1 where a in (1,2) and b > 0 order by b;
select * from table(dbms_xplan.display(format => 'basic'));

Plan hash value: 704117715

----------------------------------------
| Id  | Operation          | Name      |
----------------------------------------
|   0 | SELECT STATEMENT   |           |
|   1 |  SORT ORDER BY     |           |
|   2 |   INLIST ITERATOR  |           |
|   3 |    INDEX RANGE SCAN| TEST1_IDX |
----------------------------------------

That's why changing column1 = value1 to column1 in (value1, value2) may add an extra SORT operation.
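The same effect can be reproduced with a toy model of the index. This is a minimal Python sketch under stated assumptions, not a description of Oracle internals: the sorted list of (a, b) tuples stands in for TEST1_IDX, and `range_scan` and `inlist_scan` are hypothetical helpers that mimic one range scan per leading-column value, with the IN-list results simply concatenated.

```python
from bisect import bisect_left, bisect_right

# The four rows of TEST1. Sorting the (a, b) tuples mimics the ordering
# of the index TEST1_IDX(a, b): first by a, then by b.
index_entries = sorted([(1, 10), (1, 30), (2, 20), (2, 40)])


def range_scan(a_value):
    """Return the b values for one leading-column value, in index order."""
    lo = bisect_left(index_entries, (a_value,))
    hi = bisect_right(index_entries, (a_value, float("inf")))
    return [b for _, b in index_entries[lo:hi]]


def inlist_scan(a_values):
    """Run one range scan per IN-list value and concatenate the results."""
    rows = []
    for a_value in a_values:
        rows.extend(range_scan(a_value))
    return rows


single = range_scan(1)        # models: where a = 1 order by b
multi = inlist_scan([1, 2])   # models: where a in (1, 2) order by b

print(single, "already ordered by b:", single == sorted(single))
print(multi, "already ordered by b:", multi == sorted(multi))
```

With a single value of a, the b values come out as [10, 30], already in order. With both values, the concatenated result is [10, 30, 20, 40], which is ordered within each a but not overall, so a separate sort step is needed to satisfy ORDER BY b.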

Answer 1: (score: 0)

Use EXISTS instead of IN.

Example:

EXISTS (select 1 from DUAL where TABLE_1.F='STATE1' or TABLE_1.F='STATE2')

Try it and see whether the plan changes.

If you want to use NOT IN, use the HASH_AJ or NL_AJ hint.