How can I optimize updating a column with values from a column in another table?

Time: 2017-09-22 21:15:41

Tags: sql oracle performance query-optimization

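I am having trouble updating a field in table t1 with the value of the same field in table t2. The problem is that table t1 has 10 million records and table t2 has 20,000.

When I run: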

update cajas t1
set t1.anio   = (select anio from tempos_Cajas t2
              where t1.cliente_codigo = t2.cliente_codigo
              and t1.caja_codigo = t2.caja_codigo
              and t1.caja_numero = t2.caja_numero
              and t1.cliente_codigo = '115')
where exists(select * from tempos_Cajas t2
              where t1.cliente_codigo = t2.cliente_codigo
              and t1.caja_codigo = t2.caja_codigo
              and t1.caja_numero = t2.caja_numero)

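The command ran for several hours and the field was never updated.

I am not an Oracle expert, but I would like to know whether there is any way to optimize this SQL statement.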

2 answers:

Answer 0: (score: 0)

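For your query, you want a composite index on tempos_Cajas(cliente_codigo, caja_codigo, caja_numero, anio).

That said, I think the logic you actually want is this, with the cliente_codigo = '115' filter applied to the rows being updated rather than inside the subquery: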
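As a minimal sketch, the composite index could be created as follows. The index name is hypothetical; only the table and column list come from the answer. Because anio is the last column, the correlated subquery can probably be answered from the index alone, without visiting the table.

```sql
-- Hypothetical index name; table and columns are those named in the answer.
-- The three lookup columns lead, and anio is appended so the index
-- contains every column the subquery reads.
CREATE INDEX tempos_cajas_lookup_idx
    ON tempos_Cajas (cliente_codigo, caja_codigo, caja_numero, anio);
```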

update cajas t1
    set t1.anio   = (select anio
                     from tempos_Cajas t2
                     where t1.cliente_codigo = t2.cliente_codigo and
                           t1.caja_codigo = t2.caja_codigo and
                           t1.caja_numero = t2.caja_numero
                    )
    where exists (select 1
                  from tempos_Cajas t2
                  where t1.cliente_codigo = t2.cliente_codigo and
                        t1.caja_codigo = t2.caja_codigo and
                        t1.caja_numero = t2.caja_numero
                 ) and
          t1.cliente_codigo = '115';

In addition to the index above, you also want an index on cajas(cliente_codigo). If cliente_codigo is a number, remove the single quotes from the constant.

Answer 1: (score: 0)
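A sketch of both suggestions, assuming the tables live in the current schema and were created with unquoted names. The index name is hypothetical. The dictionary query is one way to check the column type before deciding whether to write '115' or 115.

```sql
-- Hypothetical index name; supports the t1.cliente_codigo = '115' filter.
CREATE INDEX cajas_cliente_idx
    ON cajas (cliente_codigo);

-- Check the declared type of cliente_codigo (assumes the table is in
-- the current schema and was created with an unquoted name).
SELECT column_name, data_type
FROM   user_tab_columns
WHERE  table_name  = 'CAJAS'
AND    column_name = 'CLIENTE_CODIGO';

-- If data_type is NUMBER, write the predicate as:
--     t1.cliente_codigo = 115
-- If it is VARCHAR2 or CHAR, keep the quotes:
--     t1.cliente_codigo = '115'
```

Matching the literal to the column type keeps the comparison free of implicit type conversion.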

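In the old days, the advice was to use IN when the driving table (t1) is large and the table in the subquery (t2) is small. WHERE EXISTS was considered better suited to cases where that ratio is reversed. Your numbers (t1 = 10,000,000 and t2 = 20,000) fit the first case.

However, the Oracle optimizer has become smarter over the years. Given this situation (a composite index on t1(cliente_codigo, caja_codigo, caja_numero) and no index on t2), the following update produces the same explain plan as the where exists version (on 11gR2 and 12cR2):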

update cajas t1
set t1.anio   = (select anio from tempos_Cajas t2
              where t1.cliente_codigo = t2.cliente_codigo
              and t1.caja_codigo = t2.caja_codigo
              and t1.caja_numero = t2.caja_numero)
where (t1.cliente_codigo, t1.caja_codigo, t1.caja_numero) in  
       (select t2.cliente_codigo, t2.caja_codigo, t2.caja_numero 
        from tempos_Cajas t2)
;

The plan looks like this:

SQL> 
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 1411510459

--------------------------------------------------------------------------------------------
| Id  | Operation            | Name        | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | UPDATE STATEMENT     |             |    30M|  1230M|       |   900M  (4)|999:59:59 |
|   1 |  UPDATE              | T1          |       |       |       |            |          |
|   2 |   MERGE JOIN SEMI    |             |    30M|  1230M|       |   136   (2)| 00:00:02 |
|   3 |    INDEX FULL SCAN   | T1_COMP_IDX |    30M|   801M|       |     0   (0)| 00:00:01 |
|*  4 |    SORT UNIQUE       |             | 20000 |   292K|  1112K|   136   (2)| 00:00:02 |
|   5 |     TABLE ACCESS FULL| T2          | 20000 |   292K|       |    29   (0)| 00:00:01 |
|*  6 |   TABLE ACCESS FULL  | T2          |     1 |    28 |       |    29   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - access("T1"."CLIENTE_CODIGO"="T2"."CLIENTE_CODIGO" AND
              "T1"."CAJA_CODIGO"="T2"."CAJA_CODIGO" AND "T1"."CAJA_NUMERO"="T2"."CAJA_NUMERO")
       filter("T1"."CAJA_NUMERO"="T2"."CAJA_NUMERO" AND
              "T1"."CAJA_CODIGO"="T2"."CAJA_CODIGO" AND
              "T1"."CLIENTE_CODIGO"="T2"."CLIENTE_CODIGO")
   6 - filter("T2"."CLIENTE_CODIGO"=:B1 AND "T2"."CAJA_CODIGO"=:B2 AND
              "T2"."CAJA_NUMERO"=:B3)

24 rows selected.

SQL> 

This is a fairly disastrous plan, because it hits every row in the driving table. A better solution is to use MERGE.

merge into t1
    using ( select * from t2 ) t2
    on (t1.cliente_codigo = t2.cliente_codigo
        and   t1.caja_codigo = t2.caja_codigo
        and   t1.caja_numero = t2.caja_numero)  
when matched then
    update set t1.anio = t2.anio ;

This has a much better plan:

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 525352362

----------------------------------------------------------------------------------------------
| Id  | Operation                      | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | MERGE STATEMENT                |             | 20000 |   507K|    29   (0)| 00:00:01 |
|   1 |  MERGE                         | T1          |       |       |            |          |
|   2 |   VIEW                         |             |       |       |            |          |
|   3 |    NESTED LOOPS                |             |       |       |            |          |
|   4 |     NESTED LOOPS               |             | 20000 |  1582K|    29   (0)| 00:00:01 |
|   5 |      TABLE ACCESS FULL         | T2          | 20000 |   546K|    29   (0)| 00:00:01 |
|*  6 |      INDEX RANGE SCAN          | T1_COMP_IDX |    30M|       |     0   (0)| 00:00:01 |
|   7 |     TABLE ACCESS BY INDEX ROWID| T1          |     1 |    53 |     0   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   6 - access("T1"."CLIENTE_CODIGO"="T2"."CLIENTE_CODIGO" AND
              "T1"."CAJA_CODIGO"="T2"."CAJA_CODIGO" AND "T1"."CAJA_NUMERO"="T2"."CAJA_NUMERO")

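Bear in mind that explain plans are only indicative, and that goes double for plans on toy tables with faked statistics, so please benchmark the different approaches on your actual data structures.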
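One way to check the plan against the real tables is sketched below. It assumes the tables from the question (cajas and tempos_Cajas) live in the current schema under unquoted names, and that the update should be limited to client '115', as the other answer reads the question. Neither assumption is confirmed by the asker.

```sql
-- Refresh optimizer statistics so the plan reflects the real data volumes.
-- Assumes both tables are owned by the current user.
BEGIN
    DBMS_STATS.GATHER_TABLE_STATS(ownname => USER, tabname => 'CAJAS');
    DBMS_STATS.GATHER_TABLE_STATS(ownname => USER, tabname => 'TEMPOS_CAJAS');
END;
/

-- EXPLAIN PLAN only records the plan; it does not run the MERGE.
EXPLAIN PLAN FOR
MERGE INTO cajas t1
USING (SELECT cliente_codigo, caja_codigo, caja_numero, anio
       FROM   tempos_Cajas
       WHERE  cliente_codigo = '115') t2   -- assumed filter, see lead-in
ON (    t1.cliente_codigo = t2.cliente_codigo
    AND t1.caja_codigo    = t2.caja_codigo
    AND t1.caja_numero    = t2.caja_numero)
WHEN MATCHED THEN
    UPDATE SET t1.anio = t2.anio;

-- Show the plan that was just recorded.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

If tempos_Cajas can contain more than one row for the same (cliente_codigo, caja_codigo, caja_numero), the MERGE is likely to fail with ORA-30926 when actually run, so it may be worth checking for duplicate keys first.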