Oracle SQL: get the last string in parentheses (which may itself contain nested parentheses)

Time: 2016-08-29 19:31:28

Tags: sql regex oracle substring

I am using this query:

SELECT strain.id, TRIM(SUBSTR(strain.name, 1, INSTR(strain.name, '[')-1)) AS name
FROM species_strain strain

The query above returns something like the following:

id    name
-----------------------------------------------
100   CfwHE3 (HH3d) Jt1 (CD-1)
101   4GSdg-3t 22sfG/J (mdx (fq) KO)
102   Yf7mMjfel 7(tm1) (SCID)
103   B29fj;jfos x11 (tmos (line x11))
104   B29;CD (Atm (line G5))
105   Ifkso30 jel-3
106   13GupSip (te3x) Blhas/J           --------> I don't want to get (te3x)

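I need a regular expression that gives me the contents of the last set of parentheses (which may or may not contain one or more nested sets of parentheses). This set must be at the end of the string; if it is in the middle of the string, I don't want it.

What I want to get is the following: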

(CD-1)
(mdx (fq) KO)
(SCID)
(tmos (line x11))
(Atm (line G5))

So if I copy and paste my whole query, I have this, but it does not take the inner parentheses into account:

SELECT DISTINCT REGEXP_SUBSTR(strain.name, '\(.*?\)', 1, REGEXP_COUNT(strain.name, '\(.*?\)')) AS name
FROM (
  SELECT strain.id, TRIM(SUBSTR(strain.name, 1, INSTR(strain.name, '[')-1)) AS name
  FROM species_strain strain
) strain
WHERE INSTR(strain.name, '(', 1, 1) > 0

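The query works to some extent, but if there is another set of parentheses inside the main one, it breaks and loses some data. It returns something like: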

(CD-1)
(mdx (fq)          ---------> missing KO)
(SCID)
(tmos (line x11)   ---------> missing )
(Atm (line G5)     ---------> missing )
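
To see why the lazy pattern loses the tail, it can help to replay it outside the database. The sketch below uses Python's `re` module, which is not Oracle's regex engine, so treat it only as an illustration of the behaviour described above. `last_lazy_match` is a hypothetical helper name, not something from the original query. Each lazy match stops at the first `)` it meets, so a nested group is cut short.

```python
import re

# Same idea as the '\(.*?\)' pattern in the query: lazily match from a "("
# to the nearest ")".
LAZY = re.compile(r'\(.*?\)')

def last_lazy_match(name):
    # Hypothetical helper: return the last non-overlapping lazy match,
    # or None when the string contains no match at all.
    matches = LAZY.findall(name)
    return matches[-1] if matches else None

for s in ['CfwHE3 (HH3d) Jt1 (CD-1)',
          '4GSdg-3t 22sfG/J (mdx (fq) KO)',
          'B29fj;jfos x11 (tmos (line x11))']:
    print(repr(s), '->', repr(last_lazy_match(s)))
```

For the nested rows the match ends at the inner `)`, which is the truncation shown in the output above.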

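Additional requirement

I forgot to mention that the set of parentheses I need must be at the end of the string. If there are other characters after it, I don't want it. I have added another row to my example.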

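5 answers:

Answer 0: (score: 1)

Note that Oracle regular expressions are not as powerful as PCRE or .NET regular expressions. As a result, a regex can only match a fixed, predetermined number of nested parenthesis levels.

The following regex matches the last parenthesized substring at the end of the string, allowing 1 level of nested parentheses: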

\([^()]*(\([^()]*\)[^()]*)*\)$


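This regex does not match strings like Test (get (me) ((me), too)). The pattern I provided handles 1 level of nesting. It can be extended to support 2 levels, but it would become even harder to read than it is now.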
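
As a rough illustration of how the pattern grows with each extra level, here is a sketch that builds it mechanically. It is written in Python rather than Oracle SQL, `nested_paren_pattern` is a hypothetical helper name, and Python's `re` engine differs from Oracle's, so the result would still need to be checked in the database before relying on it. With `levels=1` the helper reproduces the pattern above exactly.

```python
import re

def nested_paren_pattern(levels):
    # Hypothetical helper: build a pattern for a parenthesized group at the
    # end of the string that may contain up to `levels` levels of nesting.
    inner = r'[^()]*'                       # level 0: no parentheses inside
    for _ in range(levels):
        # Wrap the previous level: plain text, then any number of
        # "( previous level )" groups, each followed by plain text.
        inner = r'[^()]*(\(' + inner + r'\)[^()]*)*'
    return r'\(' + inner + r'\)$'

s = 'Test (get (me) ((me), too))'
print(nested_paren_pattern(2))
print(re.search(nested_paren_pattern(1), s))   # no match with 1 level
print(re.search(nested_paren_pattern(2), s))   # matches with 2 levels
```

Each extra level roughly doubles the length of the pattern, which is the readability cost mentioned above.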

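Details

  • \( - an opening (
  • [^()]* - zero or more characters other than ( and )
  • (\([^()]*\)[^()]*)* - zero or more occurrences of:
    • \( - an opening (
    • [^()]* - zero or more characters other than ( and )
    • \) - a closing )
    • [^()]* - zero or more characters other than ( and )
  • \) - a closing )
  • $ - end of string.

Usage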

regexp_substr(col_name, '\([^()]*(\([^()]*\)[^()]*)*\)$', 1, 1)
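
The pattern can be sanity-checked against the sample rows from the question. The sketch below does this in Python, whose `re` module is a different engine from Oracle's, so it only illustrates the expected matches; `trailing_group` is a hypothetical helper name standing in for the `regexp_substr` call.

```python
import re

# The one-level pattern from this answer, unchanged.
PATTERN = re.compile(r'\([^()]*(\([^()]*\)[^()]*)*\)$')

def trailing_group(name):
    # Hypothetical stand-in for regexp_substr(col_name, pattern, 1, 1):
    # return the matched text, or None when nothing matches.
    m = PATTERN.search(name)
    return m.group(0) if m else None

rows = ['CfwHE3 (HH3d) Jt1 (CD-1)',
        '4GSdg-3t 22sfG/J (mdx (fq) KO)',
        'Yf7mMjfel 7(tm1) (SCID)',
        'B29fj;jfos x11 (tmos (line x11))',
        'B29;CD (Atm (line G5))',
        'Ifkso30 jel-3',
        '13GupSip (te3x) Blhas/J']
for row in rows:
    print(repr(row), '->', repr(trailing_group(row)))
```

The last two rows produce no match: one has no parentheses, and in the other the group is not at the end of the string.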

Answer 1: (score: 1)

If creating a function is an option, the following function does the job:

create or replace
function fn_pars(p_text in varchar2) return varchar2 deterministic as 
  n_count pls_integer := 0;
begin
  if p_text is null or instr(p_text, ')', -1) = 0
    or p_text not like '%)' then
    return null;
  end if;
  for i in reverse 1..length(p_text) loop
    case substr(p_text, i, 1) 
      when ')' then n_count := n_count + 1;
      when '(' then n_count := n_count - 1;
      else null;
    end case;
    if n_count = 0 then 
      return substr(p_text, i);
    end if;
  end loop;
  return p_text;
end fn_pars;

Then you can test it:

select text,
       fn_pars(text)
  from (
          select 'B29fj;jfos x11 (tmos (line x11)) abc' text from dual union all
          select 'B29fj;j(fos) x11 (tmos (line x11))' text from dual union all
          select 'B29fj;j(fos) x11 (t(mo)s (line x11))' text from dual union all
          select '' text from dual union all
          select 'no parentheses' text from dual
       )

Result:

Text                                    fn_pars(text)
-----------------------------------------------------
B29fj;jfos x11 (tmos (line x11)) abc   (null)
B29fj;j(fos) x11 (tmos (line x11))     (tmos (line x11))
B29fj;j(fos) x11 (t(mo)s (line x11))   (t(mo)s (line x11))
(null)                                 (null)
no parentheses                         (null)

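where (null) means there is no value. :)

The function supports any level of nesting. You can also have several parenthesized groups nested at the same level.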
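
For readers who want to trace the logic without a database at hand, here is a line-for-line port of the function to Python. It is only a sketch of the same right-to-left counting idea, not a replacement for the PL/SQL. One assumption: Oracle treats the empty string as null, which the port models with `None`.

```python
def fn_pars(text):
    # Mirrors the guard clause: null/empty input, or input that does not
    # end with ")", yields no value.
    if not text or not text.endswith(')'):
        return None
    depth = 0
    # Walk from the last character to the first.
    for i in range(len(text) - 1, -1, -1):
        ch = text[i]
        if ch == ')':
            depth += 1          # entering a group (seen from the right)
        elif ch == '(':
            depth -= 1          # leaving a group
        if depth == 0:
            # The trailing ")" has found its matching "(".
            return text[i:]
    # As in the PL/SQL, an unbalanced input falls through and the whole
    # text is returned.
    return text

for t in ['B29fj;jfos x11 (tmos (line x11)) abc',
          'B29fj;j(fos) x11 (tmos (line x11))',
          'B29fj;j(fos) x11 (t(mo)s (line x11))',
          '',
          'no parentheses']:
    print(repr(t), '->', repr(fn_pars(t)))
```

Because the counter starts at the final `)`, it can only return to zero at the `(` that opens the outermost trailing group, which is why any nesting depth works.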

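Answer 2: (score: 1)

The solution below uses pure SQL (no procedures or functions). It works for any level of nested parentheses and for "sibling" parentheses at the same level. It returns null when the input is null, when the input contains no right parenthesis, or when it contains right parentheses but the rightmost one is unbalanced (there is no left parenthesis to its left that makes the pair balanced).

At the very bottom I show the small tweak needed to return the result only when the rightmost right parenthesis is the last character of the input string, and null otherwise. This is the OP's edited requirement.

I created several more input strings for testing. Note id = 156 in particular: a smart parser would not "count" parentheses that are inside string literals or are otherwise not "normal" parentheses. My solution does not go that far; it treats all parentheses the same.

The strategy is to start at the position of the rightmost right parenthesis (if there is at least one) and move left from there, one step at a time, stopping only at left parentheses (if any) and testing whether the parentheses in the substring from that point are balanced. This is easy to do by comparing the length of the "test string" after removing all ) with its length after removing all (.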
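
The strategy just described can be sketched procedurally. The Python below is an illustration of the same idea, not a translation of the SQL; `last_balanced_group` and its `at_end_only` flag are hypothetical names, with the flag standing in for the OP's edited requirement that the group must end the string.

```python
def last_balanced_group(name, at_end_only=False):
    if not name:
        return None
    end = name.rfind(')')               # rightmost right parenthesis
    if end == -1:
        return None
    if at_end_only and end != len(name) - 1:
        return None                     # something follows the last ")"
    s = name[:end + 1]                  # ignore everything after it
    pos = s.rfind('(')                  # first candidate "(" from the right
    while pos != -1:
        candidate = s[pos:]
        # Balanced when removing "(" and removing ")" leave equal lengths,
        # i.e. the candidate holds as many "(" as ")".
        if len(candidate.replace('(', '')) == len(candidate.replace(')', '')):
            return candidate
        pos = s.rfind('(', 0, pos)      # step left to the previous "("
    return None                         # no "(" balances the rightmost ")"

print(last_balanced_group('try (this (and (this))) ok?'))
print(last_balanced_group('13GupSip (te3x) Blhas/J', at_end_only=True))
```

Like the SQL, this treats every parenthesis the same, so the quoted `")"` in the id = 156 test string makes that row unbalanced.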

Bonus: I was able to write the solution using only "standard" (non-regex) string functions, with no regular expressions. That should help keep it fast.

Query

with
     species_str ( id, name) as (
       select 100, 'CfwHE3 (HH3d) Jt1 (CD-1)'         from dual union all
       select 101, '4GSdg-3t 22sfG/J (mdx (fq) KO)'   from dual union all
       select 102, 'Yf7mMjfel 7(tm1) (SCID)'          from dual union all
       select 103, 'B29fj;jfos x11 (tmos (line x11))' from dual union all
       select 104, 'B29;CD (Atm (line G5))'           from dual union all
       select 105, 'Ifkso30 jel-3'                    from dual union all
       select 106, '13GupSip (te3x) Blhas/J'          from dual union all
       select 151, ''                                 from dual union all
       select 152, 'try (this (and (this))) ok?'      from dual union all
       select 153, 'try (this (and (this)) ok?)'      from dual union all
       select 154, 'try (this (and) this (ok))?'      from dual union all
       select 155, 'try (this (and (this)'            from dual union all
       select 156, 'right grouping (includging ")")'  from dual union all
       select 157, 'try this out ) ( too'             from dual
     ),
     prep ( id, name, pos ) as (
       select id, name, instr(name, ')', -1)
       from   species_str
     ),
     rec ( id, name, str, len, prev_pos, new_pos, flag ) as (
       select  id, name, substr(name, 1, instr(name, ')', -1)),
               pos, pos - 1, pos, null
         from  prep
       union all
       select  id, name, str, len, new_pos,
               instr(str, '(',  -(len - new_pos + 2)),
               case when length(replace(substr(str, new_pos), '(', '')) =
                         length(replace(substr(str, new_pos), ')', ''))
                    then 1 end
         from  rec
         where prev_pos > 0 and flag is null
     )
select   id, name, case when flag = 1 
              then substr(name, prev_pos, len - prev_pos + 1) end as target
from     rec
where    flag = 1 or prev_pos <= 0 or name is null
order by id;

Output

        ID NAME                             TARGET                         
---------- -------------------------------- --------------------------------
       100 CfwHE3 (HH3d) Jt1 (CD-1)         (CD-1)                          
       101 4GSdg-3t 22sfG/J (mdx (fq) KO)   (mdx (fq) KO)                   
       102 Yf7mMjfel 7(tm1) (SCID)          (SCID)                          
       103 B29fj;jfos x11 (tmos (line x11)) (tmos (line x11))               
       104 B29;CD (Atm (line G5))           (Atm (line G5))                 
       105 Ifkso30 jel-3                                                    
       106 13GupSip (te3x) Blhas/J          (te3x)                          
       151                                                                  
       152 try (this (and (this))) ok?      (this (and (this)))             
       153 try (this (and (this)) ok?)      (this (and (this)) ok?)         
       154 try (this (and) this (ok))?      (this (and) this (ok))          
       155 try (this (and (this)            (this)                          
       156 right grouping (includging ")")                                  
       157 try this out ) ( too                                             

 14 rows selected 

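Change needed to satisfy the OP's (edited) requirement

In the outermost select (at the bottom of the code), where case when flag = 1 then ... defines the target column, add a condition as follows: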

... , case when flag = 1 and len = length(name) then ...

Output with this modification

        ID NAME                             TARGET                         
---------- -------------------------------- --------------------------------
       100 CfwHE3 (HH3d) Jt1 (CD-1)         (CD-1)                          
       101 4GSdg-3t 22sfG/J (mdx (fq) KO)   (mdx (fq) KO)                   
       102 Yf7mMjfel 7(tm1) (SCID)          (SCID)                          
       103 B29fj;jfos x11 (tmos (line x11)) (tmos (line x11))               
       104 B29;CD (Atm (line G5))           (Atm (line G5))                 
       105 Ifkso30 jel-3                                                    
       106 13GupSip (te3x) Blhas/J                                          
       151                                                                  
       152 try (this (and (this))) ok?                                      
       153 try (this (and (this)) ok?)      (this (and (this)) ok?)         
       154 try (this (and) this (ok))?                                      
       155 try (this (and (this)            (this)                          
       156 right grouping (includging ")")                                  
       157 try this out ) ( too                                             

 14 rows selected 

Answer 3: (score: 0)

If you have only one level of nesting, the following should work:

[code snippet missing from the source]

This satisfies all the examples in your question, but it does not allow arbitrarily nested parentheses.

Answer 4: (score: 0)

This works for nested parentheses of any depth:

Oracle Setup

CREATE TABLE species_strain ( id, name ) AS
SELECT 100,   'CfwHE3 (HH3d) Jt1 (CD-1)' FROM DUAL UNION ALL
SELECT 101,   '4GSdg-3t 22sfG/J (mdx (fq) KO)' FROM DUAL UNION ALL
SELECT 102,   'Yf7mMjfel 7(tm1) (SCID)' FROM DUAL UNION ALL
SELECT 103,   'B29fj;jfos x11 (tmos (line x11))' FROM DUAL UNION ALL
SELECT 104,   'B29;CD (Atm (line G5))' FROM DUAL UNION ALL
SELECT 105,   'Ifkso30 jel-3' FROM DUAL UNION ALL
SELECT 106,   'data (1 (2 (333 (444) 3 (4 (5) 4) 3) 2) 1)' FROM DUAL;

Query 1

WITH tmp ( id, name, pos, depth ) AS (
  SELECT id,
         name,
         LENGTH( name ),
         1
  FROM   species_strain
  WHERE  SUBSTR( name, -1 ) = ')'
UNION ALL
  SELECT id,
         name,
         pos - 1,
         depth + CASE SUBSTR( name, pos - 1, 1 )
                 WHEN '(' THEN -1
                 WHEN ')' THEN +1
                          ELSE 0 END
  FROM   tmp
  WHERE  (  depth > 1
         OR SUBSTR( name, pos -1, 1 ) <> '(' )
  AND    pos > 0
)
SELECT id,
       MAX( name ) AS name,
       MIN( SUBSTR( name, pos - 1 ) ) KEEP ( DENSE_RANK FIRST ORDER BY pos )
         AS bracket
FROM   tmp
GROUP BY id;

Query 2

SELECT id,
       name,
       substr( name, start_pos ) AS bracket
FROM   (
  SELECT id,
         name,
         LAG( CASE WHEN bracket = '(' AND depth = 1 THEN pos END )
           IGNORE NULLS OVER ( PARTITION BY id ORDER BY ROWNUM )
           AS start_pos,
         pos AS end_pos,
         bracket,
         depth
  FROM   (
    SELECT id,
           name,
           COLUMN_VALUE AS pos,
           SUBSTR( name, column_value, 1 ) AS bracket,
           SUM( CASE SUBSTR( name, column_value, 1 ) WHEN '(' THEN 1 ELSE -1 END )
             OVER ( PARTITION BY id ORDER BY ROWNUM ) AS depth
    FROM   species_strain s,
           TABLE(
             CAST(
               MULTISET(
                 SELECT REGEXP_INSTR( s.name, '[()]', 1, LEVEL )
                 FROM   DUAL
                 CONNECT BY LEVEL <= REGEXP_COUNT( s.name, '[()]' )
               ) AS SYS.ODCINUMBERLIST
             )
           ) t
    WHERE  SUBSTR( s.name, -1 ) = ')'
  )
)
WHERE bracket = ')'
AND   end_pos = LENGTH( name );

Results:

        ID NAME                                       BRACKET                                  
---------- ------------------------------------------ ------------------------------------------
       100 CfwHE3 (HH3d) Jt1 (CD-1)                   (CD-1)                                     
       101 4GSdg-3t 22sfG/J (mdx (fq) KO)             (mdx (fq) KO)                              
       102 Yf7mMjfel 7(tm1) (SCID)                    (SCID)                                     
       103 B29fj;jfos x11 (tmos (line x11))           (tmos (line x11))                          
       104 B29;CD (Atm (line G5))                     (Atm (line G5))                            
       106 data (1 (2 (333 (444) 3 (4 (5) 4) 3) 2) 1) (1 (2 (333 (444) 3 (4 (5) 4) 3) 2) 1)
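
The idea behind Query 2, as I read it, is to keep a running depth over the bracket characters and remember the most recent ( that brought the depth to 1; for a string ending in ), that is where the trailing group starts. The Python sketch below illustrates that reading. It is an assumption about the query's intent rather than a verified equivalent, and `trailing_group_by_depth` is a hypothetical name.

```python
def trailing_group_by_depth(name):
    # Same filter as the WHERE clause: only strings ending in ")" qualify.
    if not name or not name.endswith(')'):
        return None
    depth = 0
    start = None
    for pos, ch in enumerate(name):
        if ch == '(':
            depth += 1
            if depth == 1:
                start = pos     # latest "(" that opens a top-level group
        elif ch == ')':
            depth -= 1
    # The last top-level "(" seen before the final ")" starts the group.
    return name[start:] if start is not None else None

for s in ['CfwHE3 (HH3d) Jt1 (CD-1)',
          'Yf7mMjfel 7(tm1) (SCID)',
          'data (1 (2 (333 (444) 3 (4 (5) 4) 3) 2) 1)',
          'Ifkso30 jel-3']:
    print(repr(s), '->', repr(trailing_group_by_depth(s)))
```

Unlike the right-to-left approaches, this scans the whole string once from the left, so it does not stop early on long inputs.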