Oracle PL/SQL - Identifying and extracting matching words between two strings

Date: 2017-09-14 11:44:33

Tags: oracle function plsql

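I'm fairly new to Oracle functions, so apologies for my naivety.

I'm looking for a function that compares the string in COL_A with the string in COL_B and then outputs any words that appear in both to COL_C.

For example:

  • COL_A = 'Microsoft Office', COL_B = 'Windows Microsoft': the expected result in COL_C is 'Microsoft'
  • COL_A = 'Microsoft Office', COL_B = 'Microsoft Office': the expected result in COL_C is 'Microsoft Office'
  • COL_A = 'Microsoft Office', COL_B = 'Microsoft Windows': the expected result in COL_C is 'Microsoft'
  • COL_A = 'Microsoft Office', COL_B = 'Outlook': the expected result in COL_C is NULL

I found a function that almost meets the requirement (Count sequential matching words in two strings oracle). However, that function outputs a count of the matching words, and it only counts matches where the word order also matches. For my purposes the order is irrelevant, and I want the matching words themselves to be displayed.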

CREATE OR REPLACE FUNCTION STR_WORD_MATCH(
  P_STR1 IN VARCHAR2,
  P_STR2 IN VARCHAR2 )
 RETURN NUMBER
IS
 L_STR1 VARCHAR2(4000) := P_STR1;
 L_STR2 VARCHAR2(4000) := P_STR2;
 L_RES NUMBER DEFAULT 0;
 L_DEL_POS1 NUMBER;
 L_DEL_POS2 NUMBER;
 L_WORD1 VARCHAR2(1000);
 L_WORD2 VARCHAR2(1000);
BEGIN
 LOOP
  L_DEL_POS1 := INSTR(L_STR1,' ');
  L_DEL_POS2 := INSTR(L_STR2,' ');
  CASE L_DEL_POS1
  WHEN 0 THEN
   L_WORD1 := L_STR1;
   L_STR1 := '';
  ELSE
   L_WORD1 := SUBSTR(L_STR1,1,L_DEL_POS1 - 1);
  END CASE;
  CASE L_DEL_POS2
  WHEN 0 THEN
   L_WORD2 := L_STR2;
   L_STR2 := '';
  ELSE
   L_WORD2 := SUBSTR(L_STR2,1,L_DEL_POS2 - 1);
  END CASE;
  EXIT WHEN (L_WORD1 <> L_WORD2) OR ((L_WORD1 IS NULL) OR (L_WORD2 IS NULL));
  L_RES := L_RES + 1;
  L_STR1 := SUBSTR(L_STR1,L_DEL_POS1 + 1);
  L_STR2 := SUBSTR(L_STR2,L_DEL_POS2 + 1);
 END LOOP;
RETURN L_RES;
END;
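
To illustrate the limitation, here is a minimal check against the first and third example pairs. It assumes STR_WORD_MATCH has been compiled exactly as shown; the column aliases are only for illustration.

```sql
-- Both pairs share the word 'Microsoft', but the function compares
-- words position by position and stops at the first mismatch.
select str_word_match('Microsoft Office', 'Windows Microsoft') as first_pair,
       str_word_match('Microsoft Office', 'Microsoft Windows') as third_pair
  from dual;
```

The first pair yields 0 because 'Microsoft' and 'Windows' differ in the first position. The third pair yields 1. In both cases only a count comes back, not the word itself.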

As always, any help would be greatly appreciated.

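1 Answer:

Answer 0: (score: 0)

You can do this in a single query, but to simplify the syntax I created a short function for splitting a string into words. I then used this function together with listagg():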

select rn, max(c1) c1, max(c2) c2, 
       listagg(t2.column_value, ' ') within group (order by rn, c1, c2) common
  from (select rownum rn, c1, c2 from t) t 
  cross join table(split(c1)) t1 
   left join table(split(c2)) t2 on t2.column_value = t1.column_value
  group by rn

Function:

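create or replace function split(i_str in varchar2) 
  return sys.odcivarchar2list pipelined is
begin
  -- stripping every non-space run leaves only the spaces, so the loop bound
  -- is (number of spaces + 1), i.e. the word count for single-spaced input
  for i in 1..length(' '||regexp_replace(i_str, '[^ ]+')) loop
    -- emit the i-th space-delimited word
    pipe row (regexp_substr(i_str, '[^ ]+', 1, i));
  end loop;
  return;
end;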
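
A quick way to check the helper on its own, assuming the split function above has been compiled in the current schema, is to select from it directly. The alias word is only for illustration.

```sql
-- Each space-delimited word comes back as one row in COLUMN_VALUE.
select column_value as word
  from table(split('Microsoft Office'));
```

For this input the query returns two rows, Microsoft and Office. Note that the input is expected to be single-spaced: consecutive spaces would make the loop bound larger than the word count and produce extra NULL rows.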

Example:

with t (c1, c2) as (
    select 'Microsoft Office', 'Windows Microsoft' from dual union all
    select 'Microsoft Office', 'Microsoft Office'  from dual union all
    select 'Microsoft Office', 'Microsoft Windows' from dual union all
    select 'Microsoft Office', 'Outlook' from dual )
select rn, max(c1) c1, max(c2) c2, 
       listagg(t2.column_value, ' ') within group (order by rn, c1, c2) common
  from (select rownum rn, c1, c2 from t) t 
  cross join table(split(c1)) t1 
   left join table(split(c2)) t2 on t2.column_value = t1.column_value
  group by rn

Result:

        RN C1               C2                COMMON
---------- ---------------- ----------------- -----------------
         1 Microsoft Office Windows Microsoft Microsoft
         2 Microsoft Office Microsoft Office  Microsoft Office
         3 Microsoft Office Microsoft Windows Microsoft
         4 Microsoft Office Outlook
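
Since the question asked for a function rather than a query, the same idea can also be sketched as a scalar function. This is not part of the answer above: the name common_words and its parameters are hypothetical, and it assumes words are separated by single spaces and that the comparison is case-sensitive.

```sql
create or replace function common_words(
  p_str1 in varchar2,
  p_str2 in varchar2)
  return varchar2
is
  l_word   varchar2(32767);
  l_result varchar2(32767);
  l_idx    pls_integer := 1;
begin
  loop
    -- take the next space-delimited word of the first string
    l_word := regexp_substr(p_str1, '[^ ]+', 1, l_idx);
    exit when l_word is null;
    -- keep it if it also occurs as a whole word in the second string;
    -- padding both sides with spaces avoids partial-word matches
    if instr(' ' || p_str2 || ' ', ' ' || l_word || ' ') > 0 then
      l_result := l_result
                  || case when l_result is not null then ' ' end
                  || l_word;
    end if;
    l_idx := l_idx + 1;
  end loop;
  -- NULL when the two strings have no word in common
  return l_result;
end;
/

-- Usage against the sample rows from the question
select common_words('Microsoft Office', 'Windows Microsoft') as col_c from dual;
```

The matched words are returned in the order they appear in the first string, so the usage query gives 'Microsoft', and a pair such as 'Microsoft Office' and 'Outlook' gives NULL. A word repeated in the first string would be repeated in the output; deduplication is left out of this sketch.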