I have a query that takes hours to execute and sometimes never finishes at all. The query is as follows:
SELECT id, trim(regexp_substr(str, '[^,]+', 1, LEVEL)) str
FROM (SELECT id , MULTILIST01 as str from PAGE_TWO where MULTILIST01 is not null )
WHERE trim(regexp_substr(str, '[^,]+', 1, LEVEL)) is not null
CONNECT BY instr(str, ',', 1, LEVEL -1) > 0;
The result set of the inner query
SELECT id , MULTILIST01 as str from PAGE_TWO where MULTILIST01 is not null
is as follows:
ID MULTILIST01
295285 ,3434925,3434442,3436781,
212117 ,3434925,3434442,3436781,
212120 ,3434925,3434442,3436781,
6031650 ,3436781,
.
.
.
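In the outer query, I am trying to return each comma-separated value as a separate row. When I execute the outer query, it takes hours to run. I have tried to optimize it, without success.
Any ideas on how to optimize it?
Oracle version information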
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
PL/SQL Release 12.1.0.2.0 - Production
CORE 12.1.0.2.0 Production
TNS for 64-bit Windows: Version 12.1.0.2.0 - Production
NLSRTL Version 12.1.0.2.0 - Production
Explain plan information
Plan hash value: 4097679000
------------------------------------------------------------------------------------------
| Id  | Operation                     | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |          |  1429 | 11432 |   840   (2)| 00:00:01 |
|*  1 |  FILTER                       |          |       |       |            |          |
|*  2 |   CONNECT BY WITHOUT FILTERING|          |       |       |            |          |
|*  3 |    TABLE ACCESS FULL          | PAGE_TWO |  1429 | 11432 |   840   (2)| 00:00:01 |
------------------------------------------------------------------------------------------
Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
1 - SEL$F5BB74E1
3 - SEL$F5BB74E1 / PAGE_TWO@SEL$2
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(TRIM( REGEXP_SUBSTR ("MULTILIST01",'[^,]+',1,LEVEL)) IS NOT NULL)
2 - filter(INSTR("MULTILIST01",',',1,LEVEL-1)>0)
3 - filter("MULTILIST01" IS NOT NULL)
Column Projection Information (identified by operation id):
-----------------------------------------------------------
1 - "ID"[NUMBER,22], "MULTILIST01"[VARCHAR2,1020], LEVEL[4]
2 - "ID"[NUMBER,22], "MULTILIST01"[VARCHAR2,1020], LEVEL[4]
3 - "ID"[NUMBER,22], "MULTILIST01"[VARCHAR2,1020]
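The table contains 225 columns, with an index only on the primary key columns (ID, CLASS).
This table belongs to Agile PLM.
Answer 0 (score: 2)
Your approach works only for a table that contains a single row.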
SELECT id, trim(regexp_substr(str, '[^,]+', 1, LEVEL)) str
FROM (SELECT id , MULTILIST01 as str from PAGE_TWO where MULTILIST01 is not null and rownum <= 1 )
WHERE trim(regexp_substr(str, '[^,]+', 1, LEVEL)) is not null
CONNECT BY instr(str, ',', 1, LEVEL -1) > 0
order by 1,2;
ID STR
---------- -------------------------
295285 3434442
295285 3434925
295285 3436781
Starting with a two-row table, you get (probably) more results than expected:
SELECT id, trim(regexp_substr(str, '[^,]+', 1, LEVEL)) str
FROM (SELECT id , MULTILIST01 as str from PAGE_TWO where MULTILIST01 is not null and rownum <= 2 )
WHERE trim(regexp_substr(str, '[^,]+', 1, LEVEL)) is not null
CONNECT BY instr(str, ',', 1, LEVEL -1) > 0
order by 1,2;
ID STR
---------- -------------------------
212117 3434442
212117 3434442
212117 3434925
212117 3436781
212117 3436781
212117 3436781
212117 3436781
295285 3434442
295285 3434442
295285 3434925
295285 3436781
295285 3436781
295285 3436781
295285 3436781
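The duplicates appear because the CONNECT BY condition contains nothing that ties a child row to its own parent row: at each level, every row is connected to every row whose string still has enough commas, so the number of generated rows grows exponentially with LEVEL. With two rows, each id yields 1 + 2 + 4 = 7 rows instead of 3, and with 1429 rows the intermediate result becomes enormous. A commonly used workaround is to restrict the hierarchy to the same row with PRIOR. The following is only a sketch: it assumes id is unique within the subquery (the primary key is (ID, CLASS), so CLASS may have to be selected and added to the PRIOR condition as well), and the PRIOR SYS_GUID() predicate is there only to stop Oracle from reporting a CONNECT BY loop.

```sql
-- Sketch only: assumes id alone identifies a row of the subquery.
SELECT id, TRIM(REGEXP_SUBSTR(str, '[^,]+', 1, LEVEL)) AS str
FROM (SELECT id, MULTILIST01 AS str
      FROM PAGE_TWO
      WHERE MULTILIST01 IS NOT NULL)
WHERE TRIM(REGEXP_SUBSTR(str, '[^,]+', 1, LEVEL)) IS NOT NULL
CONNECT BY INSTR(str, ',', 1, LEVEL - 1) > 0
       AND PRIOR id = id                  -- stay within the same source row
       AND PRIOR SYS_GUID() IS NOT NULL   -- non-deterministic value, avoids the loop error
ORDER BY 1, 2;
```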
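Reformulating the query as follows gets you what you (probably) want. Use a subquery to provide the substring index (1..N); you must define the maximum number of substrings that will be split. Join this subquery with your table to effectively multiply each row by N.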
with substr_idx as (
select rownum colnum from dual connect by level <= 3 /* max number of substrings */)
SELECT id, trim(regexp_substr(str, '[^,]+', 1, colnum)) str
FROM (SELECT id , MULTILIST01 as str from PAGE_TWO where MULTILIST01 is not null), substr_idx
WHERE trim(regexp_substr(str, '[^,]+', 1, colnum)) is not null
order by 1,2;
ID STR
---------- -------------------------
212117 3434442
212117 3434925
212117 3436781
212120 3434442
212120 3434925
212120 3436781
295285 3434442
295285 3434925
295285 3436781
6031650 3436781
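If the maximum number of substrings is not known in advance, it can probably be derived from the data instead of being hard-coded. The sketch below assumes REGEXP_COUNT is available (it is in 11g and later); max_parts and n are names made up for this illustration. It costs one extra scan of PAGE_TWO.

```sql
WITH max_parts AS (
  -- largest number of non-empty elements found in any list
  SELECT MAX(REGEXP_COUNT(MULTILIST01, '[^,]+')) AS n
  FROM PAGE_TWO
  WHERE MULTILIST01 IS NOT NULL
),
substr_idx AS (
  -- max_parts has exactly one row, so this generates 1..n
  SELECT LEVEL AS colnum
  FROM max_parts
  CONNECT BY LEVEL <= n
)
SELECT id, TRIM(REGEXP_SUBSTR(str, '[^,]+', 1, colnum)) AS str
FROM (SELECT id, MULTILIST01 AS str
      FROM PAGE_TWO
      WHERE MULTILIST01 IS NOT NULL), substr_idx
WHERE TRIM(REGEXP_SUBSTR(str, '[^,]+', 1, colnum)) IS NOT NULL
ORDER BY 1, 2;
```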
You can gain a further (minor) performance improvement by replacing the regular expressions with substr/instr extraction. See e.g. here
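A rough sketch of what the substr/instr variant could look like, under the assumption that elements are separated by single commas as in the sample data. The string is first normalized to have exactly one leading and one trailing comma, so that the k-th element always sits between the k-th and the (k+1)-th comma. The src CTE name is made up for this illustration, and the hard-coded 3 is the same maximum as above.

```sql
WITH substr_idx AS (
  SELECT LEVEL AS colnum FROM dual CONNECT BY LEVEL <= 3 /* max number of substrings */
),
src AS (
  -- normalize to ',a,b,c,' regardless of leading/trailing commas in the data
  SELECT id, ',' || TRIM(BOTH ',' FROM MULTILIST01) || ',' AS str
  FROM PAGE_TWO
  WHERE MULTILIST01 IS NOT NULL
)
SELECT id,
       TRIM(SUBSTR(str,
                   INSTR(str, ',', 1, colnum) + 1,
                   INSTR(str, ',', 1, colnum + 1) - INSTR(str, ',', 1, colnum) - 1)) AS str
FROM src, substr_idx
-- true only if the (colnum+1)-th comma exists and the element between is non-empty
WHERE INSTR(str, ',', 1, colnum + 1) - INSTR(str, ',', 1, colnum) > 1
ORDER BY 1, 2;
```

Unlike the '[^,]+' pattern, this version does not skip empty elements in the middle of a list when counting positions, so the two variants agree only when the lists contain no consecutive commas.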
The moral of the story: if you get no result with big data, try small data (see the rownum <= 2 restriction above) and check whether the result is as expected.
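One way to make that check less dependent on eyeballing the output is to compare the number of rows the query returns on a small sample with the number of elements actually present in that sample. This is only a sketch; sample_rows, expected_rows and actual_rows are names invented here, and which two rows ROWNUM <= 2 picks is not deterministic.

```sql
WITH sample_rows AS (
  SELECT id, MULTILIST01 AS str
  FROM PAGE_TWO
  WHERE MULTILIST01 IS NOT NULL AND ROWNUM <= 2
)
SELECT
  -- number of non-empty elements in the sampled lists
  (SELECT SUM(REGEXP_COUNT(str, '[^,]+')) FROM sample_rows) AS expected_rows,
  -- number of rows the original CONNECT BY query produces for the same sample
  (SELECT COUNT(*)
   FROM (SELECT id
         FROM sample_rows
         WHERE TRIM(REGEXP_SUBSTR(str, '[^,]+', 1, LEVEL)) IS NOT NULL
         CONNECT BY INSTR(str, ',', 1, LEVEL - 1) > 0)) AS actual_rows
FROM dual;
```

For the two-row example shown earlier, the lists hold 6 elements in total while the query returned 14 rows, which is the kind of mismatch this comparison is meant to surface.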