String removal in Oracle PL/SQL

Time: 2014-08-26 11:20:19

Tags: sql oracle replace plsql

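I need to remove certain keywords from an input string and return the new string. The keywords are stored in another table, e.g. MR, MRS, DR, PVT, PRIVATE, CO, COMPANY, LTD, LIMITED. There are two kinds of keyword: LEADING (MR, MRS, DR) and TRAILING (PVT, PRIVATE, CO, COMPANY). A LEADING keyword must be removed from the beginning of the string, and a TRAILING keyword from the end. For example, MR Jones MRS COMPANY should return JONES MRS, and MR MRS Jones PVT COMPANY should return JONES (the first iteration trims one keyword from each end, leaving MRS JONES PVT, and the second iteration leaves JONES). Likewise, MR MRS Doe PVT COMPANY LTD should finally return DOE.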

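I have to do this in PL/SQL. I wrote the code below, but it does not remove all the keywords when there are several at the beginning or at the end. The cause is that I loop over the keyword cursor only once: if a keyword is not yet at the end when the loop reaches it, it is never tried again. Note that there may be no keyword at all at the beginning or the end: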

CREATE OR REPLACE FUNCTION replace_keyword (p_in_name IN VARCHAR2)
   RETURN VARCHAR2
IS
   l_name   VARCHAR2 (4000);

   CURSOR c
   IS
      SELECT *
        FROM RSRV_KEY_LKUPS
       WHERE ACTIVE = 'Y';
BEGIN
    l_name := TRIM (p_in_name); 

   --Now inside the function we’ll loop through this cursor something like below and replace the value in the input name:

   FOR rec IN c
   LOOP
      IF     UPPER (rec.POSITION) = 'LEADING'
         AND INSTR (UPPER (l_name), UPPER (rec.KEY_WORD || ' '), 1) > 0
      THEN                                        --Rule 3:remove leading name
         DBMS_OUTPUT.PUT_LINE ('Value >>' || rec.KEY_WORD);
         l_name := LTRIM (UPPER (l_name), rec.KEY_WORD || ' ');

      ELSIF     UPPER (rec.POSITION) = 'TRAILING'
            AND INSTR (UPPER (l_name), UPPER (' ' || rec.KEY_WORD), -1) > 0
      THEN                                       --Rule 4:remove trailing name
         DBMS_OUTPUT.PUT_LINE ('Value >>' || rec.KEY_WORD);
         l_name := RTRIM (UPPER (l_name), ' ' || rec.KEY_WORD);      
      END IF;

      l_name := l_name;
   END LOOP;

   l_name := REGEXP_REPLACE (l_name, '[[:space:]]{2,}', ' '); --Remove multiple spaces in a word and replace with single blank space
   l_name := TRIM (l_name); --Remove the leading and trailing blank spaces
   RETURN l_name;
EXCEPTION
   WHEN OTHERS
   THEN
      raise_application_error (
         -20001,
         'An error was encountered - ' || SQLCODE || ' -ERROR- ' || SQLERRM);
END;
/

Any help is much appreciated.

Edit: Sample input 1

MR MRS Jones PVT COMPANY 

Output

JONES

Sample input 2

MR MRS Doe PVT COMPANY LTD 

Output

DOE

2 answers:

Answer 0: (score: 1)

If you want to be sure that a leading keyword is found at the very beginning, remove it only when INSTR returns 1:

Replace

IF UPPER (rec.POSITION) = 'LEADING'
   AND INSTR (UPPER (l_name), UPPER (rec.KEY_WORD || ' '), 1) > 0

with

IF UPPER (rec.POSITION) = 'LEADING'
   AND INSTR (UPPER (l_name), UPPER (rec.KEY_WORD || ' '), 1) = 1

and replace

  ELSIF     UPPER (rec.POSITION) = 'TRAILING'
        AND INSTR (UPPER (l_name), UPPER (' ' || rec.KEY_WORD), -1) > 0

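with

  ELSIF UPPER (rec.POSITION) = 'TRAILING'
        -- ' ' || KEY_WORD is LENGTH(KEY_WORD) + 1 characters long, so when it ends the string it starts at LENGTH(l_name) - LENGTH(KEY_WORD)
        AND INSTR (UPPER (l_name), UPPER (' ' || rec.KEY_WORD), -1) = LENGTH (l_name) - LENGTH (rec.key_word)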

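For the multiple-keyword problem, you have to wrap the FOR loop in an outer loop that repeats until a full pass finds nothing to remove:

keyword_found BOOLEAN;   -- goes in the declaration section
...
LOOP
  keyword_found := FALSE;
  FOR rec IN c
  LOOP
    -- when you find (and remove) a keyword
    keyword_found := TRUE;
  END LOOP;
  EXIT WHEN NOT keyword_found;
END LOOP;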
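
Putting the two suggestions above together, a minimal sketch of the whole function might look like the following. The name replace_keyword_sketch is hypothetical, and the table and column names are taken from the question. One deliberate change from the question's code: LTRIM and RTRIM treat their second argument as a set of characters rather than as a whole word, so this sketch compares and cuts with SUBSTR instead.

```sql
CREATE OR REPLACE FUNCTION replace_keyword_sketch (p_in_name IN VARCHAR2)
   RETURN VARCHAR2
IS
   l_name    VARCHAR2 (4000);
   l_key     VARCHAR2 (4000);
   l_found   BOOLEAN;
BEGIN
   -- Normalise once: upper-case, collapse whitespace runs, trim both ends.
   l_name := TRIM (REGEXP_REPLACE (UPPER (p_in_name), '[[:space:]]+', ' '));

   LOOP
      l_found := FALSE;

      -- One full pass over the active keywords (table as in the question).
      FOR rec IN (SELECT key_word, position
                    FROM rsrv_key_lkups
                   WHERE active = 'Y')
      LOOP
         l_key := UPPER (rec.key_word);

         IF     UPPER (rec.position) = 'LEADING'
            AND SUBSTR (l_name, 1, LENGTH (l_key) + 1) = l_key || ' '
         THEN
            -- Drop the keyword and the space that follows it.
            l_name := SUBSTR (l_name, LENGTH (l_key) + 2);
            l_found := TRUE;
         ELSIF     UPPER (rec.position) = 'TRAILING'
               AND SUBSTR (l_name, -(LENGTH (l_key) + 1)) = ' ' || l_key
         THEN
            -- Drop the keyword and the space that precedes it.
            l_name := SUBSTR (l_name, 1, LENGTH (l_name) - LENGTH (l_key) - 1);
            l_found := TRUE;
         END IF;
      END LOOP;

      -- Stop once a complete pass removed nothing.
      EXIT WHEN NOT l_found;
   END LOOP;

   RETURN l_name;
END;
/
```

Assuming the table holds the keywords listed in the question, calling replace_keyword_sketch ('MR MRS Jones PVT COMPANY') should come out as JONES. A string that consists of a single keyword and nothing else is left untouched, because the match requires a separating space.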

Answer 1: (score: 1)

I think it can be done with a single query (which you can wrap in a PL/SQL function if you need to for some reason):

Here is a sqlfiddle demo

with inpt as (select 'MR Jones MRS COMPANY' text from dual)
select listagg(t1.word, ' ') within group (order by ord) new_text
from (
  select w.*, words.*,
         sum(case when nvl(POSITION, 'TRAILING') = 'TRAILING' then 1 else 0 end)
           over (order by ord rows between unbounded preceding and current row) l,
         sum(case when nvl(POSITION, 'LEADING') = 'LEADING' then 1 else 0 end)
           over (order by ord desc rows between unbounded preceding and current row) t
  from (select regexp_substr(inpt.text, '[^ ]+', 1, level) word, level ord
        from inpt
        connect by level <= regexp_count(inpt.text, ' ') + 1) words
  left outer join RSRV_KEY_LKUPS w on w.KEY_WORD = words.word
) t1
where t1.t > 0 and t1.l > 0

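Edit: Explanation:

The 'with' clause just supplies the input string as a column (it is not required).

The inner select aliased "words" is a well-known technique for splitting a string of words into rows (note that I keep the original order in the ord column).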
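
Run on its own, the word-splitting step can be sketched like this. It assumes a single input row and words separated by exactly one space, as in the example input:

```sql
-- One row per space-separated word; ord preserves the original word order.
with inpt as (select 'MR Jones MRS COMPANY' text from dual)
select regexp_substr(inpt.text, '[^ ]+', 1, level) word,
       level ord
from inpt
connect by level <= regexp_count(inpt.text, ' ') + 1
order by ord
```

For this input it returns four rows: MR, Jones, MRS and COMPANY, numbered 1 to 4.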

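Now we can left outer join the words of the input string to the keywords in table 'RSRV_KEY_LKUPS'. For every word in the input this tells us whether it is LEADING, TRAILING, or null (if it is not a keyword at all).

So far, for the input "MR Jones MRS COMPANY", we have: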

KEY_WORD    POSITION    WORD    ORD 
----------------------------------- 
MR          LEADING     MR      1  
(null)      (null)      Jones   2  
MRS         LEADING     MRS     3  
COMPANY     TRAILING    COMPANY 4 

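Now comes the tricky part (perhaps there is a better way): we need to know somehow which words to remove. Reading top down, that is every LEADING word until a "change", meaning until we hit a null or a TRAILING. Reading bottom up, it is every TRAILING word until a "change", meaning a null or a LEADING. So I used a well-known technique, the cumulative sum: as long as the sum is still zero, the row has to be removed (once we reach the "change", the sum has some value).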
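
To see the two running sums side by side, here is a self-contained sketch of that step. The kw subquery is a hypothetical inline stand-in for RSRV_KEY_LKUPS, holding only the three keywords this example needs:

```sql
with inpt as (select 'MR Jones MRS COMPANY' text from dual),
     kw as (  -- hypothetical stand-in for RSRV_KEY_LKUPS
        select 'MR' key_word, 'LEADING' position from dual union all
        select 'MRS', 'LEADING' from dual union all
        select 'COMPANY', 'TRAILING' from dual),
     words as (
        select regexp_substr(inpt.text, '[^ ]+', 1, level) word, level ord
        from inpt
        connect by level <= regexp_count(inpt.text, ' ') + 1)
select words.ord, words.word, kw.position,
       -- l stays 0 while only LEADING keywords have been seen from the top
       sum(case when nvl(kw.position, 'TRAILING') = 'TRAILING' then 1 else 0 end)
          over (order by words.ord rows between unbounded preceding and current row) l,
       -- t stays 0 while only TRAILING keywords have been seen from the bottom
       sum(case when nvl(kw.position, 'LEADING') = 'LEADING' then 1 else 0 end)
          over (order by words.ord desc rows between unbounded preceding and current row) t
from words
left outer join kw on kw.key_word = words.word
order by words.ord
```

For this input, l runs 0, 1, 1, 2 and t runs 3, 2, 1, 0 from the first word to the last. Only Jones and MRS have both sums above zero, so they are the rows that survive the final filter.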

That's it. All that remains is to gather the rows back into a new string, and since 11gR2 we can simply use LISTAGG.
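
If the query does need to live in a PL/SQL function, a minimal sketch could look like the following. The name strip_keywords_sql is hypothetical. Upper-casing the input and filtering on ACTIVE = 'Y' are assumptions borrowed from the question's code, not part of the query above:

```sql
CREATE OR REPLACE FUNCTION strip_keywords_sql (p_in_name IN VARCHAR2)
   RETURN VARCHAR2
IS
   l_result   VARCHAR2 (4000);
BEGIN
   SELECT LISTAGG (t1.word, ' ') WITHIN GROUP (ORDER BY t1.ord)
     INTO l_result
     FROM (SELECT words.word,
                  words.ord,
                  SUM (CASE WHEN NVL (w.position, 'TRAILING') = 'TRAILING' THEN 1 ELSE 0 END)
                     OVER (ORDER BY words.ord
                           ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) l,
                  SUM (CASE WHEN NVL (w.position, 'LEADING') = 'LEADING' THEN 1 ELSE 0 END)
                     OVER (ORDER BY words.ord DESC
                           ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) t
             FROM (    SELECT REGEXP_SUBSTR (UPPER (p_in_name), '[^ ]+', 1, LEVEL) word,
                              LEVEL ord
                         FROM DUAL
                   CONNECT BY LEVEL <= REGEXP_COUNT (p_in_name, ' ') + 1) words
                  LEFT OUTER JOIN rsrv_key_lkups w
                     -- assumption: only active keywords count, as in the question's cursor
                     ON w.key_word = words.word AND w.active = 'Y') t1
    WHERE t1.t > 0 AND t1.l > 0;

   RETURN l_result;
END;
/
```

Because the select is a plain aggregate with no GROUP BY, it always returns exactly one row, so the SELECT INTO needs no NO_DATA_FOUND handler. The result is NULL when every word is stripped.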