Oracle INSTR exact (whole-word) match

Time: 2015-03-09 10:52:21

Tags: sql regex oracle substring

I have the following query:

SQL> select * from RTECS_ABBREV ra
  2  where instr(trim('100 mmol/plate (-S9)'), ra.abbrev) > 0;

ABBREV                         DEFINITION
------------------------------ --------------------------------------------------------------------------------
mmo                            Mutation in Micro-organism
mmol                           millimole
mol                            mole
S                              second

SQL> 

I would like to get the following result instead:

SQL> select * from RTECS_ABBREV ra
  2  where instr(trim('100 mmol/plate (-S9)'), ra.abbrev) > 0;

ABBREV                         DEFINITION
------------------------------ --------------------------------------------------------------------------------
mmol                           millimole
S                              second

SQL> 

because "mmo" and "mol" are only parts of the word "mmol".

More details:

Suppose I have the following data:

with abbr as
(
      select 'mmo' as abbrev from dual union 
      select 'mmol' as abbrev from dual union
      select 'mol' as abbrev from dual union
      select 'ug' as abbrev from dual union
      select 'mg' as abbrev from dual union
      select 'ppm' as abbrev from dual union
      select 'nmol' as abbrev from dual union
      select 'nm' as abbrev from dual union
      select 'ol' as abbrev from dual union
      select 'S' as abbrev from dual

),
main_data  as
(
select '24231' as id_, '10 ug/plate (-S9)' as data_ from dual union 
select '24232' as id_, '1 pph' as data_ from dual union 
select '24233' as id_, '100 mmol/plate (-S9)' as data_ from dual union 
select '24234' as id_, '100 mmol/plate (-S9)' as data_ from dual union 
select '24235' as id_, '1 pph' as data_ from dual union 
select '24236' as id_, '19300 nmol/L (-S9)' as data_ from dual union 
select '24237' as id_, '800 mg/L' as data_ from dual union 
select '24238' as id_, '600 ppm/2H-C (-S9)' as data_ from dual union 
select '24239' as id_, '500 mg/L (-S9)' as data_ from dual union 
select '24240' as id_, '2000 ppm (-S9)' as data_ from dual union 
select '24241' as id_, '100 mmol/plate (-S9)' as data_ from dual union 
select '24242' as id_, '1 pph (-S9)' as data_ from dual union 
select '24243' as id_, 'ihl 2700 ppm' as data_ from dual union 
select '24244' as id_, 'par 10 mmol/L' as data_ from dual union 
select '24245' as id_, 'mul 1 pph/8H-C' as data_ from dual                          
)
select * from main_data

In main_data.data_, I need to replace every word that matches an entry in abbr.abbrev with another string (for example, "test").

So, for example, for "100 mmol/plate (-S9)" I need:

100 test/plate (-test9) but not,

100 testl/plate (-test9) or 100 testol/plate (-test9)

So the rule seems to be: replace whole-word matches of abbr.abbrev, and if the text is between parentheses, replace any matching characters.

1 answer:

Answer 0 (score: 1)

Based on your sample data, I think you want something like the following:

SELECT * FROM main_data INNER JOIN abbr
    ON REGEXP_LIKE(main_data.data_, '(^|\W)' || abbr.abbrev || '(\W|$)');

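I use the regular expression above because Oracle regular expressions do not support word-boundary anchors. The first group matches either the start of the string or a "non-word" character (one that is neither alphanumeric nor an underscore _). The last group matches either a non-word character or the end of the string.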
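
To see how the two groups behave, here is a minimal sketch that runs the same pattern against one value from the question. The four-row abbr list is a reduced, hypothetical subset of the question's data, and the literal string stands in for main_data.data_.

```sql
-- Reduced, hypothetical subset of the abbreviations from the question.
WITH abbr AS (
    SELECT 'mmo'  AS abbrev FROM dual UNION ALL
    SELECT 'mmol' AS abbrev FROM dual UNION ALL
    SELECT 'mol'  AS abbrev FROM dual UNION ALL
    SELECT 'S'    AS abbrev FROM dual
)
SELECT abbr.abbrev
  FROM abbr
 WHERE REGEXP_LIKE('100 mmol/plate (-S9)',
                   -- start of string or non-word char, the abbreviation,
                   -- then non-word char or end of string
                   '(^|\W)' || abbr.abbrev || '(\W|$)');
```

This should return only mmol: "mmo" is followed by the letter "l" and "mol" is preceded by the letter "m", so neither is delimited on both sides. Note that "S" is not matched either, because in "-S9" it is followed by the digit 9, which counts as a word character. Matching inside the parentheses, as the question asks, would need a looser rule than this pattern. The sketch also assumes the abbreviations contain no regular-expression metacharacters; any that do would have to be escaped before concatenation.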

If the unit is always preceded by some measurement, then checking for the start of the string (the ^ anchor) is not strictly necessary.

If you want to perform the replacement, you will need to use REGEXP_REPLACE() with the regular expression above rather than just REGEXP_LIKE().
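
As a rough sketch of that replacement, the query below hard-codes a single abbreviation ("mmol") and the replacement word "test" from the question. Both are illustrative assumptions; applying every row of abbr would need an extra step, such as a PL/SQL loop or a recursive query, which is not shown here.

```sql
SELECT REGEXP_REPLACE('100 mmol/plate (-S9)',
                      '(^|\W)mmol(\W|$)',   -- same delimiters as in REGEXP_LIKE
                      '\1test\2')           -- put the captured delimiters back
           AS replaced
  FROM dual;
```

This should return "100 test/plate (-S9)". The backreferences \1 and \2 restore the characters captured on either side of the abbreviation, which would otherwise be removed along with it. Because each match consumes its delimiters, two occurrences separated by a single character (for example "mmol mmol") would not both be replaced in one pass.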