Oracle REGEXP_REPLACE while keeping part of the matched content

Time: 2017-01-17 05:52:48

Tags: regex oracle unicode oracle10g

I have text in a column, for example:

Hello World %UC#abc#UC%. How are you %UC#def#UC%. Have a nice day %UC#ghi#UC%.

I want to use REGEXP_REPLACE (or any other function) to replace %UC#<value>#UC% with UNISTR(<value>). For the example above, the result should be:

Hello World (UNISTR of abc). How are you (UNISTR of def). Have a nice day (UNISTR of ghi).

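Basically, it should strip the %UC# and #UC% markers and replace the value between them with the UNISTR of that value.

Is there a way to achieve this?

1 Answer:

Answer 0: (score: 0)

In 11g and later, this could be one way (LISTAGG requires 11g Release 2):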

with test(s) as ( select 'Hello World %UC#abc#UC%. How are you %UC#def#UC%. Have a nice day %UC#ghi#UC%.' || '%UC##UC%' from dual)
select listagg (str) within group ( order by lev)
from (
        select regexp_substr(s, '(^|#UC%)(.*?)(%UC#)', 1, level, '', 2) || 
               UPPER(regexp_substr(s, '(%UC#)(.*?)(#UC%)', 1, level, '', 2)) as str,
               level as lev
        from test
        connect by instr(s, '%UC#', 1, level ) > 0
     )

This gives (I used UPPER instead of UNISTR to make the result easy to read):

Hello World ABC. How are you DEF. Have a nice day GHI.
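Since UPPER only stands in for UNISTR above, here is a minimal sketch of the same query with UNISTR actually applied. The sample input and its escape sequences (`\20AC`, `\00E9`) are assumptions made up for illustration; UNISTR expects `\XXXX` hexadecimal code points and returns its result in the national character set.

```sql
-- Assumed sample input: the values between the markers are UNISTR escapes.
with test(s) as (
    select 'Price: 10 %UC#\20AC#UC%. Caf%UC#\00E9#UC%.' || '%UC##UC%' from dual
)
select listagg(str) within group (order by lev) as result
from (
        -- plain text before each marker, followed by the decoded marker value
        select regexp_substr(s, '(^|#UC%)(.*?)(%UC#)', 1, level, '', 2) ||
               unistr(regexp_substr(s, '(%UC#)(.*?)(#UC%)', 1, level, '', 2)) as str,
               level as lev
        from test
        connect by instr(s, '%UC#', 1, level) > 0
     );
```

On a database whose character set can represent them, the two markers should come out as the euro sign and an accented e. LISTAGG returns VARCHAR2, so characters that the database character set cannot hold may be lost in the conversion.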

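The idea here is to use the usual string-splitting technique, treating the parts wrapped in '%UC#...#UC%' as the separators. Note that I append a small string ('%UC##UC%') to the input to handle its last part: the query then sees the string as ending with an (empty) '%UC#...#UC%' sequence.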
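To see the split at work, the inner query can be run on its own, without the aggregation. This is only a sketch: the input is shortened, and the column aliases `text_before` and `marked_value` are hypothetical names chosen for readability.

```sql
with test(s) as (
    select 'Hello World %UC#abc#UC%. How are you %UC#def#UC%.' || '%UC##UC%' from dual
)
select level as lev,
       -- text between the previous '#UC%' (or the start) and the next '%UC#'
       regexp_substr(s, '(^|#UC%)(.*?)(%UC#)', 1, level, '', 2) as text_before,
       -- value wrapped by the level-th '%UC#...#UC%' pair
       regexp_substr(s, '(%UC#)(.*?)(#UC%)', 1, level, '', 2) as marked_value
from test
connect by instr(s, '%UC#', 1, level) > 0
order by lev;
```

Each row should pair a piece of plain text with the value that follows it. The last row comes from the appended '%UC##UC%', so its `marked_value` should be NULL and only the trailing text contributes to the result.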

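In Oracle 10g, neither listagg nor this form of regexp_substr (with the sub-expression argument) is available, so the solution is a bit more complicated.

Here I use no regular expressions at all and compute the aggregation with SYS_CONNECT_BY_PATH; to do this, I need to pick a string that will never appear in the input text, say '@@':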

with test as (
    select 'Hello World %UC#abc#UC%. How are you %UC#def#UC%. Have a nice day %UC#ghi#UC%.' || '%UC##UC%' as s
    from dual
)
select replace(
           sys_connect_by_path(
               -- plain text before the level-th '%UC#' marker
               substr(s,
                      case when level = 1 then 1 else instr(s, '#UC%', 1, level - 1) + 4 end,
                      instr(s, '%UC#', 1, level)
                        - case when level = 1 then 1 else instr(s, '#UC%', 1, level - 1) + 4 end) ||
               -- value between the level-th '%UC#' and '#UC%' (UPPER stands in for UNISTR)
               UPPER(substr(s,
                            instr(s, '%UC#', 1, level) + 4,
                            instr(s, '#UC%', 1, level) - (instr(s, '%UC#', 1, level) + 4))),
               '@@'),
           '@@') str
from test
where connect_by_isleaf = 1
connect by instr(s, '%UC#', 1, level) > 0
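
As a rough illustration of how SYS_CONNECT_BY_PATH builds the aggregate, a stripped-down query can show the path growing by one piece per level. The expression 'part' || level is an assumption that merely stands in for the real pieces of text.

```sql
-- One row per level; each path is the previous path plus '@@' and the new piece.
select level as lev,
       connect_by_isleaf as is_leaf,
       sys_connect_by_path('part' || level, '@@') as path
from dual
connect by level <= 3;
```

Only the leaf row (the deepest level) should hold the full concatenation, which is why the query above filters on connect_by_isleaf = 1 and then strips the '@@' separators with REPLACE. If the input itself contained '@@', that final REPLACE would remove it as well, so the separator has to be chosen with care.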