Oracle 11g通过正则表达式获取所有匹配的匹配项

时间:2013-07-11 14:40:01

标签: regex oracle

我正在使用Oracle 11g,我想使用REGEXP_SUBSTR来匹配给定模式的所有事件。 例如

 SELECT
  REGEXP_SUBSTR('Txa233141b Ta233141 Ta233142 Ta233147 Ta233148',
  '(^|\s)[A-Za-z]{2}[0-9]{5,}(\s|$)') "REGEXP_SUBSTR"
  FROM DUAL;

仅返回第一个匹配Ta233141,但我想返回与正则表达式匹配的其他匹配项,即Ta233142 Ta233147 Ta233148。

5 个答案:

答案 0 :(得分:18)

REGEXP_SUBSTR只返回一个值。您可以将字符串转换为伪表,然后查询匹配项。有一种基于XML的方法可以让我现在逃避,但只要你只有一个源字符串,就可以使用connect-by工作:

SELECT REGEXP_SUBSTR(str, '[^ ]+', 1, LEVEL) AS substr
FROM (
    SELECT 'Txa233141b Ta233141 Ta233142 Ta233147 Ta233148' AS str FROM DUAL
)
CONNECT BY LEVEL <= LENGTH(REGEXP_REPLACE(str, '[^ ]+')) + 1;

...给你:

SUBSTR             
--------------------
Txa233141b           
Ta233141             
Ta233142             
Ta233147            
Ta233148            

...您可以使用原始模式稍微简单的版本来过滤它:

SELECT substr
FROM (
    SELECT REGEXP_SUBSTR(str, '[^ ]+', 1, LEVEL) AS substr
    FROM (
        SELECT 'Txa233141b Ta233141 Ta233142 Ta233147 Ta233148' AS str
        FROM DUAL
    )
    CONNECT BY LEVEL <= LENGTH(REGEXP_REPLACE(str, '[^ ]+')) + 1
)
WHERE REGEXP_LIKE(substr, '^[A-Za-z]{2}[0-9]{5,}$');

SUBSTR             
--------------------
Ta233141             
Ta233142             
Ta233147             
Ta233148             

这不是很漂亮,但在一个字段中都没有多个值。

答案 1 :(得分:1)

添加一个循环并返回所有值的函数怎么样?

create or replace function regexp_substr_mr (
  p_data clob,
  p_re varchar
  )
return varchar as
  v_cnt number;
  v_results varchar(4000);
begin
  v_cnt := regexp_count(p_data, p_re, 1,'m');
  if v_cnt < 25 then
    for i in 1..v_cnt loop
      v_results := v_results || regexp_substr(p_data,p_re,1,i,'m') || chr(13) || chr(10);
    end loop;
  else 
    v_results := 'WARNING more than 25 matches found';
  end if;

  return v_results;
end;

然后只需将该函数作为select查询的一部分调用。

答案 2 :(得分:1)

我正在修复@Alex Poole answer以获得多线源支持并更快地执行:

with templates as (select '\w+' regexp from dual)
select 
    regexp_substr(str, templates.regexp, 1, level) substr
from (
    select 1 id, 'Txa233141b Ta233141 Ta233142 Ta233147 Ta233148' as str from dual
    union
    select 2 id, '2 22222222222222Ta233141 2Ta233142 2Ta233147' as str from dual
    union
    select 3 id, '3Txa233141b 3Ta233141 3Ta233142' as str from dual
)
join templates on 1 = 1
connect by 
    id = connect_by_root id
    and regexp_instr(str, templates.regexp, 1, level) > 0
order by id, level

来源:

ID  STR                                             
--  ----------------------------------------------  
1   Txa233141b Ta233141 Ta233142 Ta233147 Ta233148  
2   2 22222222222222Ta233141 2Ta233142 2Ta233147    
3   3Txa233141b 3Ta233141 3Ta233142                 

结果:

Txa233141b              
Ta233141                
Ta233142                
Ta233147                
Ta233148                
2                       
22222222222222Ta233141  
2Ta233142               
2Ta233147               
3Txa233141b             
3Ta233141               
3Ta233142               

答案 3 :(得分:0)

基于@David E. Veliev answer,这是对多行输入的查询。如果以下查询对您有用,请考虑投票original answer

SELECT SUBSTR
  FROM (WITH TEMPLATES AS (SELECT '\w+' REGEXP FROM DUAL)
         SELECT ID,
                CONNECT_BY_ROOT ID CBR,
                LEVEL LVL,
                REGEXP_SUBSTR(STR, TEMPLATES.REGEXP, 1, LEVEL) SUBSTR
           FROM (SELECT 1 ID,
                        'Txa233141b Ta233141 Ta233142 Ta233147 Ta233148' AS STR
                   FROM DUAL
                 UNION
                 SELECT 2 ID,
                        '2 22222222222222Ta233141 2Ta233142 2Ta233147' AS STR
                   FROM DUAL
                 UNION
                 SELECT 3 ID,
                        '3Txa233141b 3Ta233141 3Ta233142' AS STR
                   FROM DUAL)
           JOIN TEMPLATES
             ON 1 = 1
         CONNECT BY REGEXP_INSTR(STR, TEMPLATES.REGEXP, 1, LEVEL) > 0)
          WHERE ID = CBR
          GROUP BY ID,
                   CBR,
                   LVL,
                   SUBSTR
          ORDER BY ID,
                   LVL;

输入:

ID  STR
==  ==============================================
1   Txa233141b Ta233141 Ta233142 Ta233147 Ta233148
2   2 22222222222222Ta233141 2Ta233142 2Ta233147
3   3Txa233141b 3Ta233141 3Ta233142

输出:

SUBSTR
======================
Txa233141b
Ta233141
Ta233142
Ta233147
Ta233148
2
22222222222222Ta233141
2Ta233142
2Ta233147
3Txa233141b
3Ta233141
3Ta233142

答案 4 :(得分:-1)

以下是您问题的简单解决方案。

SELECT REGEXP_SUBSTR('Txa233141b Ta233141 Ta233142 Ta233147 Ta233148',
  '([a-zA-Z0-9]+\s?){1,}') "REGEXP_SUBSTR"
  FROM DUAL;