Question

I have a query that takes hours to execute, and sometimes never finishes at all. The query is:

SELECT id, trim(regexp_substr(str, '[^,]+', 1, LEVEL)) str
FROM (SELECT id , MULTILIST01 as str from PAGE_TWO where MULTILIST01 is not null )
WHERE trim(regexp_substr(str, '[^,]+', 1, LEVEL)) is not null
CONNECT BY instr(str, ',', 1, LEVEL -1) > 0;

The result set of the inner query

SELECT id , MULTILIST01 as str from PAGE_TWO where MULTILIST01 is not null

looks like this:

ID       MULTILIST01 
295285  ,3434925,3434442,3436781,
212117  ,3434925,3434442,3436781,
212120  ,3434925,3434442,3436781,
6031650 ,3436781,
.
.
.

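In the outer query I am trying to return each comma-separated value as its own row. When I run the outer query, it takes hours. I have tried to optimize it, without success.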

Any ideas on how to optimize it?

Oracle version information

Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
PL/SQL Release 12.1.0.2.0 - Production
CORE    12.1.0.2.0  Production
TNS for 64-bit Windows: Version 12.1.0.2.0 - Production
NLSRTL Version 12.1.0.2.0 - Production

Explain plan output

Plan hash value: 4097679000

------------------------------------------------------------------------------------------
| Id  | Operation                     | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |          |  1429 | 11432 |   840   (2)| 00:00:01 |
|*  1 |  FILTER                       |          |       |       |            |          |
|*  2 |   CONNECT BY WITHOUT FILTERING|          |       |       |            |          |
|*  3 |    TABLE ACCESS FULL          | PAGE_TWO |  1429 | 11432 |   840   (2)| 00:00:01 |
------------------------------------------------------------------------------------------

Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------

   1 - SEL$F5BB74E1
   3 - SEL$F5BB74E1 / PAGE_TWO@SEL$2

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(TRIM( REGEXP_SUBSTR ("MULTILIST01",'[^,]+',1,LEVEL)) IS NOT NULL)
   2 - filter(INSTR("MULTILIST01",',',1,LEVEL-1)>0)
   3 - filter("MULTILIST01" IS NOT NULL)

Column Projection Information (identified by operation id):
-----------------------------------------------------------

   1 - "ID"[NUMBER,22], "MULTILIST01"[VARCHAR2,1020], LEVEL[4]
   2 - "ID"[NUMBER,22], "MULTILIST01"[VARCHAR2,1020], LEVEL[4]
   3 - "ID"[NUMBER,22], "MULTILIST01"[VARCHAR2,1020]

The table has 225 columns; the only index is on the primary key columns (ID, CLASS).

This table belongs to Agile PLM.

Answer 1

Your approach works for a table that contains only one row.

SELECT id, trim(regexp_substr(str, '[^,]+', 1, LEVEL)) str
FROM (SELECT id , MULTILIST01 as str from PAGE_TWO where MULTILIST01 is not null and rownum <= 1 )
WHERE trim(regexp_substr(str, '[^,]+', 1, LEVEL)) is not null
CONNECT BY instr(str, ',', 1, LEVEL -1) > 0
order by 1,2;

        ID STR                     
---------- -------------------------
    295285 3434442                   
    295285 3434925                   
    295285 3436781

With a two-row table you get (probably) more rows than expected:

SELECT id, trim(regexp_substr(str, '[^,]+', 1, LEVEL)) str
FROM (SELECT id , MULTILIST01 as str from PAGE_TWO where MULTILIST01 is not null and rownum <= 2 )
WHERE trim(regexp_substr(str, '[^,]+', 1, LEVEL)) is not null
CONNECT BY instr(str, ',', 1, LEVEL -1) > 0
order by 1,2;   

        ID STR                     
---------- -------------------------
    212117 3434442                   
    212117 3434442                   
    212117 3434925                   
    212117 3436781                   
    212117 3436781                   
    212117 3436781                   
    212117 3436781                   
    295285 3434442                   
    295285 3434442                   
    295285 3434925                   
    295285 3436781                   
    295285 3436781                   
    295285 3436781                   
    295285 3436781        
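The duplicates arise because the CONNECT BY condition never ties a row to its own parent: at each level, every row is connected to every row of the previous level, so the intermediate row count roughly doubles per level for two rows (2 + 4 + 8 = 14 surviving rows above) and explodes for a large table. A commonly used workaround, which is not part of the original answer, is to restrict the hierarchy to the same source row. The sketch below assumes that id alone identifies a row of the inner query; since the primary key is (ID, CLASS), CLASS may need to be added to the PRIOR condition as well.

```sql
SELECT id, TRIM(REGEXP_SUBSTR(str, '[^,]+', 1, LEVEL)) str
FROM (SELECT id, MULTILIST01 AS str FROM PAGE_TWO WHERE MULTILIST01 IS NOT NULL)
WHERE TRIM(REGEXP_SUBSTR(str, '[^,]+', 1, LEVEL)) IS NOT NULL
CONNECT BY INSTR(str, ',', 1, LEVEL - 1) > 0
       AND PRIOR id = id                  -- stay within the same source row (assumes id is unique here)
       AND PRIOR SYS_GUID() IS NOT NULL   -- non-deterministic call that prevents the CONNECT BY loop error
ORDER BY 1, 2;
```

The PRIOR SYS_GUID() IS NOT NULL predicate is always true; it is only there so that Oracle does not report a cycle when a row connects to itself.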

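This reformulation of the query gives you what you (probably) want. Use a subquery to supply the substring index (1..N). You have to define the maximum number of substrings to split out. Join this subquery to your table to effectively multiply each row by N.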

with substr_idx as (
select  rownum colnum from dual connect by level <= 3 /*  max  number of substrings */)   
SELECT id, trim(regexp_substr(str, '[^,]+', 1, colnum)) str
FROM (SELECT id , MULTILIST01 as str from PAGE_TWO where MULTILIST01 is not null), substr_idx
WHERE trim(regexp_substr(str, '[^,]+', 1, colnum)) is not null
order by 1,2;  

        ID STR                     
---------- -------------------------
    212117 3434442                   
    212117 3434925                   
    212117 3436781                   
    212120 3434442                   
    212120 3434925                   
    212120 3436781                   
    295285 3434442                   
    295285 3434925                   
    295285 3436781                   
   6031650 3436781
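The limit of 3 in the query above is hard-coded. One possible way to find a suitable value, assuming REGEXP_COUNT is available (it is in 11g and later, so it should be in 12.1), is to count the tokens in the longest list first:

```sql
-- Largest number of comma-separated values in any row;
-- use the result as the upper bound in "CONNECT BY LEVEL <= ...".
SELECT MAX(REGEXP_COUNT(MULTILIST01, '[^,]+')) AS max_substrings
FROM PAGE_TWO
WHERE MULTILIST01 IS NOT NULL;
```

This needs one extra full scan of PAGE_TWO, so it is probably only worth running occasionally rather than on every execution.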

You can gain a further (minor) performance improvement by replacing the regular expression with substr/instr extraction. See, e.g., here
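A minimal sketch of the substr/instr variant follows. It assumes every list starts and ends with a comma, as in the sample data (for example ,3434925,3434442,3436781,); lists without a trailing comma would lose their last value and would need to be normalized first.

```sql
WITH substr_idx AS (
  SELECT LEVEL colnum FROM dual CONNECT BY LEVEL <= 3  -- max number of substrings
)
SELECT id,
       -- the colnum-th value sits between the colnum-th and (colnum+1)-th comma
       TRIM(SUBSTR(str,
                   INSTR(str, ',', 1, colnum) + 1,
                   INSTR(str, ',', 1, colnum + 1) - INSTR(str, ',', 1, colnum) - 1)) str
FROM (SELECT id, MULTILIST01 AS str FROM PAGE_TWO WHERE MULTILIST01 IS NOT NULL), substr_idx
-- keep only positions where both commas exist and something lies between them
WHERE INSTR(str, ',', 1, colnum + 1) - INSTR(str, ',', 1, colnum) > 1
ORDER BY 1, 2;
```

Unlike the regexp version, this does not skip values that consist only of spaces; an extra TRIM(...) IS NOT NULL filter would be needed if such values can occur.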

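The moral of the story: if you cannot get a result with big data, try it with small data (see the rownum <= 2 limit above) and check whether the result is what you expect.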
