Regex, substr, or another way to find a string

Time: 2019-04-09 08:47:44

Tags: sql regex string oracle substr

I want the best possible performance, and I want to select only the string that follows the token "DL:".

I have a column (varchar2) with these values:

    DL:1011909825
    Obj:020190004387 DL:8010406429
    Obj:020190004388 DL:8010406428
    DL:190682
    DL:PDL01900940
    Obj:020190004322 DL:611913067

So the output should be:

    1011909825
    8010406429
    8010406428
    190682
    PDL01900940
    611913067

I'm not a regex expert, but I tried regexp_replace:

regexp_replace(column,'Obj:|DL:','',1, 0, 'i')

This is almost right, but the output still differs, because the Obj value is left in place:

    1011909825
    020190004387 8010406429
    020190004388 8010406428
    190682
    PDL01900940
    020190004322 611913067

How can I fix this and get the best performance?

4 answers:

Answer 0: (score: 1)

If the data always looks like this, SUBSTR + INSTR does the job:

SQL> with test (col) as
  2    (
  3      select 'DL:1011909825' from dual union all
  4      select 'Obj:020190004387 DL:8010406429' from dual union all
  5      select 'Obj:020190004388 DL:8010406428' from dual union all
  6      select 'DL:190682' from dual union all
  7      select 'DL:PDL01900940' from dual union all
  8      select 'Obj:020190004322 DL:611913067' from dual
  9     )
 10  select col, substr(col, instr(col, 'DL:') + 3) result
 11  from test;

COL                            RESULT
------------------------------ ------------------------------
DL:1011909825                  1011909825
Obj:020190004387 DL:8010406429 8010406429
Obj:020190004388 DL:8010406428 8010406428
DL:190682                      190682
DL:PDL01900940                 PDL01900940
Obj:020190004322 DL:611913067  611913067

6 rows selected.

SQL>
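
One edge case worth checking: when a row has no DL: marker, INSTR returns 0 and the expression falls back to SUBSTR(col, 3), which returns unrelated text rather than NULL. Below is a minimal sketch of a guarded version. It runs the SQL through Python's built-in sqlite3 module purely as a stand-in for Oracle (an assumption made so the example is self-contained; SQLite's substr and instr behave the same way for these inputs), and the 'Obj:020190004399' row is a hypothetical value that is not in the original data.

```python
import sqlite3

# SQLite stands in for Oracle here (assumption); substr/instr are 1-based in both.
rows = [
    ("DL:1011909825",),
    ("Obj:020190004387 DL:8010406429",),
    ("DL:PDL01900940",),
    ("Obj:020190004399",),  # hypothetical row without a DL: marker
]

conn = sqlite3.connect(":memory:")
conn.execute("create table test (col text)")
conn.executemany("insert into test values (?)", rows)

query = """
    select col,
           substr(col, instr(col, 'DL:') + 3) as unguarded,
           case when instr(col, 'DL:') > 0
                then substr(col, instr(col, 'DL:') + 3)
           end as guarded
    from test
"""

# Map each input value to its (unguarded, guarded) results.
results = {col: (unguarded, guarded)
           for col, unguarded, guarded in conn.execute(query)}

for col, (unguarded, guarded) in results.items():
    print(f"{col!r}: unguarded={unguarded!r}, guarded={guarded!r}")
```

The CASE expression has no ELSE branch, so rows without the marker come back as NULL (None in Python). The same expression should carry over to Oracle as written.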

A REGEXP_SUBSTR version might look like this:

 <snip>
 10  select col,
 11         ltrim(regexp_substr(col, 'DL:\w+'), 'DL:') result
 12  from test;

COL                            RESULT
------------------------------ -----------------------------
DL:1011909825                  1011909825
Obj:020190004387 DL:8010406429 8010406429
Obj:020190004388 DL:8010406428 8010406428
DL:190682                      190682
DL:PDL01900940                 PDL01900940
Obj:020190004322 DL:611913067  611913067

With a lot of data, the SUBSTR + INSTR version should be faster than the regular expression.
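
One caveat with the LTRIM variant: its second argument is a set of characters, not a prefix, so it keeps stripping any leading D, L or colon. The sample data happens to be safe, but a value that begins with one of those letters would be cut short. Python's str.lstrip has the same set semantics, so this sketch uses it as a stand-in for Oracle's LTRIM (an assumption for illustration only; 'DL:LD4711' is a hypothetical value), next to a capture group that avoids the problem.

```python
import re
from typing import Optional


def strip_with_set(value: str) -> Optional[str]:
    """Mimic ltrim(regexp_substr(col, 'DL:\\w+'), 'DL:') with set-based stripping."""
    match = re.search(r"DL:\w+", value)
    # lstrip("DL:") removes ANY leading 'D', 'L' or ':' characters, not the prefix "DL:".
    return match.group(0).lstrip("DL:") if match else None


def capture_group(value: str) -> Optional[str]:
    """Return only the word characters captured after the literal 'DL:'."""
    match = re.search(r"DL:(\w+)", value)
    return match.group(1) if match else None


safe = "Obj:020190004387 DL:8010406429"   # from the question's data
risky = "DL:LD4711"                        # hypothetical value starting with L and D

print(strip_with_set(safe), capture_group(safe))
print(strip_with_set(risky), capture_group(risky))
```

For the hypothetical value the set-based strip removes the leading "LD" as well, while the capture group returns the value intact.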

Answer 1: (score: 1)

substr + instr will perform better, but if you want to use a regexp:

-- substr + instr will have better performance
with s (str) as (
select 'DL:1011909825' from dual union all
select 'Obj:020190004387 DL:8010406429' from dual union all
select 'Obj:020190004388 DL:8010406428' from dual union all
select 'DL:190682' from dual union all
select 'DL:PDL01900940' from dual union all
select 'Obj:020190004322 DL:611913067' from dual)
select str, regexp_substr(str, 'DL:(.*)', 1, 1, null, 1) rs
from s;

STR                            RS                            
------------------------------ ------------------------------
DL:1011909825                  1011909825                    
Obj:020190004387 DL:8010406429 8010406429                    
Obj:020190004388 DL:8010406428 8010406428                    
DL:190682                      190682                        
DL:PDL01900940                 PDL01900940                   
Obj:020190004322 DL:611913067  611913067                     

6 rows selected.
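
Here the sixth argument of regexp_substr selects the first parenthesised subexpression, so only the text after DL: is returned. Because `.*` is greedy, it runs to the end of the string, which matters if anything ever follows the DL value. A rough Python equivalent is sketched below; the re module is only a stand-in for Oracle's regex engine, and the 'DL:42 Obj:7' input and the `\S+` alternative are hypothetical illustrations, not part of the original answer.

```python
import re
from typing import Optional

samples = [
    "DL:1011909825",
    "Obj:020190004387 DL:8010406429",
    "Obj:020190004388 DL:8010406428",
    "DL:190682",
    "DL:PDL01900940",
    "Obj:020190004322 DL:611913067",
]


def after_dl(value: str, pattern: str = r"DL:(.*)") -> Optional[str]:
    """Return capture group 1 of the pattern, or None when there is no match."""
    match = re.search(pattern, value)
    return match.group(1) if match else None


extracted = [after_dl(s) for s in samples]
print(extracted)

# Hypothetical input where more text follows the DL value.
trailing = "DL:42 Obj:7"
greedy = after_dl(trailing)                 # '.*' runs to the end of the string
bounded = after_dl(trailing, r"DL:(\S+)")   # '\S+' stops at the first whitespace
print(greedy, bounded)
```

For the data in the question both patterns give the same result; the bounded form only matters if the column layout ever changes.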

Answer 2: (score: 1)

You might get some ideas from this.

DL:(.*)

Match 1
1.  1011909825
Match 2
1.  8010406429
Match 3
1.  8010406428
Match 4
1.  190682
Match 5
1.  PDL01900940
Match 6
1.  611913067

https://rubular.com/r/jKjcPs8sPr4Ifn

Answer 3: (score: 1)

Or use regexp_substr:

with t(str) as
(
 select 'DL:1011909825'                  from dual union all
 select 'Obj:020190004387 DL:8010406429' from dual union all
 select 'Obj:020190004388 DL:8010406428' from dual union all
 select 'DL:190682'                      from dual union all
 select 'DL:PDL01900940'                 from dual union all
 select 'Obj:020190004322 DL:611913067'  from dual 
) 
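-- [^DL:] is a character set (any character except D, L or ':'), so '[^DL:]+$'
-- would cut 'PDL01900940' down to '01900940'; '[^:]+$' keeps everything
-- after the last colon instead
select regexp_substr(str, '[^:]+$') as str
  from t;

STR
-----------
1011909825
8010406429
8010406428
190682
PDL01900940
611913067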
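
Note that square brackets define a character set: `[^DL:]` means "any single character other than D, L or colon", not "anything other than the text DL:". A trailing-match pattern built on it therefore drops the leading letters of a value such as PDL01900940, whereas `[^:]+$` returns everything after the last colon. The short sketch below compares the two; Python's re module is used only as a stand-in for Oracle's regexp_substr, which is an assumption made to keep the example self-contained.

```python
import re
from typing import Optional


def tail(value: str, pattern: str) -> Optional[str]:
    """Return the first match of the pattern in value, or None."""
    match = re.search(pattern, value)
    return match.group(0) if match else None


value = "DL:PDL01900940"                   # from the question's data

char_class = tail(value, r"[^DL:]+$")      # excludes D, L and ':' individually
after_colon = tail(value, r"[^:]+$")       # everything after the last colon

print(char_class)
print(after_colon)

# Purely numeric values are unaffected by the difference.
numeric = "Obj:020190004322 DL:611913067"
print(tail(numeric, r"[^DL:]+$"), tail(numeric, r"[^:]+$"))
```

The `[^:]+$` form still assumes that DL: is the last labelled field and that the value itself contains no colon.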
