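Replacing characters after a defined pattern with regexp_replace in Oracle

Time: 2017-04-20 11:57:27

Tags: oracle regexp-replace

I want to replace individual characters in a string using Oracle's regexp_replace function. The replacement should only start after a defined pattern.

Example:

In the string "Heyho || HeyheyHo", I want to replace every "y" that comes after the pattern "||" with the character "i". Characters that occur before the pattern should be left alone.

String: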

Heyho || HeyheyHo

String after replacement:

Heyho || HeiheiHo

Is there an easy way to do this?

2 answers:

Answer 0: (score: 1)

You don't need a regular expression; you can use INSTR, SUBSTR and REPLACE to get what you need:

with test(s) as (
        select 'Heyho || HeyheyHo' from dual
    )
    /* the query */
    select s as input,
           substr(s, 1, instr(s, '||')+1) || 
           replace( substr(s, instr(s, '||')+2), 'y', 'i') as result
    from test

This gives:

INPUT             RESULT
----------------- --------------------
Heyho || HeyheyHo Heyho || HeiheiHo

How it works:

select s as input,
       substr(s, 1, instr(s, '||')+1) beforeDelimiter,
       substr(s, instr(s, '||')+2)   afterDelimiter,
       replace( substr(s, instr(s, '||')+2), 'y', 'i') afterDelimiterEdited,
       substr(s, 1, instr(s, '||')+1) || 
       replace( substr(s, instr(s, '||')+2), 'y', 'i') as result
from test

This gives:

INPUT             BEFOREDELI AFTERDELIM AFTERDELIM RESULT
----------------- ---------- ---------- ---------- --------------------
Heyho || HeyheyHo Heyho ||    HeyheyHo   HeiheiHo  Heyho || HeiheiHo

If the string contains more than one occurrence of ||, replace modifies every matching character after the first occurrence.
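
To make the first-occurrence behaviour concrete, here is a small sketch with a made-up input that contains the delimiter twice. The sample string and the column aliases after_first and after_last are illustrative assumptions, not part of the original question. Passing -1 as the third argument of INSTR makes it search backwards from the end of the string, so it locates the last occurrence instead of the first:

```sql
with test(s) as (
    -- hypothetical input with two delimiters
    select 'Hey || Hey || Hey' from dual
)
select s as input,
       -- split at the FIRST '||' (default search direction)
       substr(s, 1, instr(s, '||') + 1) ||
       replace(substr(s, instr(s, '||') + 2), 'y', 'i') as after_first,
       -- split at the LAST '||' (position -1 searches backwards)
       substr(s, 1, instr(s, '||', -1) + 1) ||
       replace(substr(s, instr(s, '||', -1) + 2), 'y', 'i') as after_last
from test
```

With this input, after_first should come out as 'Hey || Hei || Hei' and after_last as 'Hey || Hey || Hei'.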

Following Mathguy's comment, I can't claim that this solution is faster than a regular expression.

A solution using regexp could be:

select regexp_replace(s, 'y', 'i', instr(s, '||') ) as result
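
Both variants assume that the delimiter is actually present. When it is missing, INSTR returns 0: the SUBSTR/REPLACE version then keeps only the first character and replaces in the rest of the string, and I would expect regexp_replace to reject a start position of 0 (that last point is an assumption worth checking on your version). A minimal guarded sketch, where the second sample row is a hypothetical input without the delimiter:

```sql
with test(s) as (
    select 'Heyho || HeyheyHo' from dual union all
    select 'Heyho HeyheyHo'    from dual   -- hypothetical row without '||'
)
select s as input,
       case
           when instr(s, '||') > 0
           then regexp_replace(s, 'y', 'i', instr(s, '||') + 2)
           else s   -- no delimiter: leave the string untouched
       end as result
from test
```

Whether a string without the delimiter should be left unchanged or replaced throughout depends on the requirement; the CASE branch is the place to make that choice.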

Here is a small performance test on 2 tables created in the same way with the same data (5 million rows):

SQL> create table testA3(s) as
  2  select regexp_replace(s, 'y', 'i', instr(s, '||') ) as result
  3  from testA;

Table created.

Elapsed: 00:00:30.75
SQL> create table testB3(s) as
  2  select substr(s, 1, instr(s, '||')+1) ||
  3             replace( substr(s, instr(s, '||')+2), 'y', 'i') as result
  4  from testB;

Table created.

Elapsed: 00:00:14.82

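The standard approach appears to be faster; the same test with 3M rows takes roughly 18 seconds for the regexp approach and roughly 7 seconds for the standard one.

The test is of course not exhaustive, and the results may vary depending on many factors, but I would consider the standard approach a good alternative to regexp, even in a case like this where several standard operations are needed to reproduce the result of a single regular expression.

Here is the complete test with 3M rows; I used one CREATE and 2 INSERTs per table to avoid the memory problems that CONNECT BY runs into with very high levels.

Also, between the 3M and 5M row tests, I dropped the tables and created them again, to make sure caching did not affect the results.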

SQL> create table testA(s) as
  2  select 'Heyho || HeyheyHo' || level || 'HeyheyHo'
  3  from dual
  4  connect by level <= 1000000;

Table created.

SQL> create table testB(s) as
  2  select 'Heyho || HeyheyHo' || level || 'HeyheyHo'
  3  from dual
  4  connect by level <= 1000000;

Table created.

SQL> insert into testB(s)
  2  select 'Heyho || HeyheyHo' || to_char(level + 1000000) || 'HeyheyHo'
  3  from dual
  4  connect by level <= 1000000;

1000000 rows created.

SQL> insert into testA(s)
  2  select 'Heyho || HeyheyHo' || to_char(level + 1000000)  || 'HeyheyHo'
  3  from dual
  4  connect by level <= 1000000;

1000000 rows created.

SQL> insert into testB(s)
  2  select 'Heyho || HeyheyHo' || to_char(level + 2000000)  || 'HeyheyHo'
  3  from dual
  4  connect by level <= 1000000;

1000000 rows created.

SQL> insert into testA(s)
  2  select 'Heyho || HeyheyHo' || to_char(level + 2000000)  || 'HeyheyHo'
  3  from dual
  4  connect by level <= 1000000;

1000000 rows created.

SQL> select count(1), count(distinct s) from testA;

  COUNT(1) COUNT(DISTINCTS)
---------- ----------------
   3000000          3000000

SQL> select count(1), count(distinct s) from testB;

  COUNT(1) COUNT(DISTINCTS)
---------- ----------------
   3000000          3000000

SQL> set timing on
SQL> create table testA2(s) as
  2  select regexp_replace(s, 'y', 'i', instr(s, '||')+2 ) as result
  3  from testA;

Table created.

Elapsed: 00:00:17.66
SQL> create table testB2(s) as
  2  select substr(s, 1, instr(s, '||')+1) ||
  3             replace( substr(s, instr(s, '||')+2), 'y', 'i') as result
  4  from testB;

Table created.

Elapsed: 00:00:06.96
SQL>
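
As a side note on the test setup above: a common way to sidestep the CONNECT BY memory limit without splitting the load into several statements is to cross join two small row generators. This is only a sketch of that alternative, not what was used for the timings; the aliases a, b and n are made up for illustration:

```sql
-- 1000 x 3000 = 3,000,000 rows; each CONNECT BY stays small
create table testA(s) as
select 'Heyho || HeyheyHo' || rownum || 'HeyheyHo'
from (select level as n from dual connect by level <= 1000) a
cross join
     (select level as n from dual connect by level <= 3000) b;
```

ROWNUM runs from 1 to 3000000 here, so the generated strings should match the ones produced by the CREATE plus two INSERTs shown above.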

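Answer 1: (score: 1)

Here is a solution using regexp_replace. The fourth argument is the starting position. After some thought, I decided not to leave out the '+2'. There is no sense in being lazy and wasting cycles testing characters you know are not the target character.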

SQL> with tbl(str) as (
     select 'Heyho || HeyheyHo' from dual
   )
   select str before,
          regexp_replace(str, 'y', 'i', instr(str, '||')+2) after
   from tbl;

BEFORE            AFTER
----------------- -----------------
Heyho || HeyheyHo Heyho || HeiheiHo

SQL>
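
To illustrate the start position a little further: regexp_replace also takes a fifth argument, the occurrence, which is counted from the start position (0, the default, means every match). The sketch below hard-codes position 9, which is instr(str, '||')+2 for this particular string; the column aliases are made up for illustration:

```sql
select regexp_replace('Heyho || HeyheyHo', 'y', 'i', 9)    as all_after,   -- every 'y' from position 9 on
       regexp_replace('Heyho || HeyheyHo', 'y', 'i', 9, 1) as first_after  -- only the first 'y' from position 9 on
from dual;
```

The first column should give 'Heyho || HeiheiHo' as above, the second 'Heyho || HeiheyHo'. In both cases the characters before the start position are returned unchanged.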