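I need to remove certain keywords from an input string and return the new string. The keywords are stored in another table, for example MR, MRS, DR, PVT, PRIVATE, CO, COMPANY, LTD, LIMITED. There are two kinds of keywords: LEADING (MR, MRS, DR) and TRAILING (PVT, PRIVATE, CO, COMPANY). A LEADING keyword must be removed from the beginning of the string, and a TRAILING keyword must be removed from the end. For example, MR Jones MRS COMPANY should return JONES MRS, while MR MRS Jones PVT COMPANY should return JONES (in the first iteration MR and COMPANY are trimmed, giving MRS JONES PVT; in the second iteration it becomes JONES). Likewise, MR MRS Doe PVT COMPANY LTD should finally return DOE.
I have to do this in PL/SQL. I wrote the code below, but it goes wrong when there are several keywords at the beginning or at the end. I loop over the keyword cursor because, if a keyword is not yet at the end when the loop passes it, that keyword cannot be used again for a later replacement. Note that there may be no keyword at all at the beginning or at the end: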
CREATE OR REPLACE FUNCTION replace_keyword (p_in_name IN VARCHAR2)
   RETURN VARCHAR2
IS
   l_name   VARCHAR2 (4000);

   CURSOR c
   IS
      SELECT *
        FROM RSRV_KEY_LKUPS
       WHERE ACTIVE = 'Y';
BEGIN
   l_name := TRIM (p_in_name);

   --Now inside the function we'll loop through this cursor something like below and replace the value in the input name:
   FOR rec IN c
   LOOP
      IF UPPER (rec.POSITION) = 'LEADING'
         AND INSTR (UPPER (l_name), UPPER (rec.KEY_WORD || ' '), 1) > 0
      THEN --Rule 3:remove leading name
         DBMS_OUTPUT.PUT_LINE ('Value >>' || rec.KEY_WORD);
         l_name := LTRIM (UPPER (l_name), rec.KEY_WORD || ' ');
      ELSIF UPPER (rec.POSITION) = 'TRAILING'
         AND INSTR (UPPER (l_name), UPPER (' ' || rec.KEY_WORD), -1) > 0
      THEN --Rule 4:remove trailing name
         DBMS_OUTPUT.PUT_LINE ('Value >>' || rec.KEY_WORD);
         l_name := RTRIM (UPPER (l_name), ' ' || rec.KEY_WORD);
      END IF;

      l_name := l_name;
   END LOOP;

   l_name := REGEXP_REPLACE (l_name, '[[:space:]]{2,}', ' '); --Remove multiple spaces in a word and replace with single blank space
   l_name := TRIM (l_name); --Remove the leading and trailing blank spaces
   RETURN l_name;
EXCEPTION
   WHEN OTHERS
   THEN
      raise_application_error (
         -20001,
         'An error was encountered - ' || SQLCODE || ' -ERROR- ' || SQLERRM);
END;
/
Any help is greatly appreciated.
Edit: Sample input 1
MR MRS Jones PVT COMPANY
Output
JONES
Sample input 2
MR MRS Doe PVT COMPANY LTD
Output
DOE
Answer 0 (score: 1)
If you want to be sure that a leading keyword is found at the very beginning of the string, remove it only when INSTR returns 1.
Replace
IF UPPER (rec.POSITION) = 'LEADING'
AND INSTR (UPPER (l_name), UPPER (rec.KEY_WORD || ' '), 1) > 0
with
IF UPPER (rec.POSITION) = 'LEADING'
AND INSTR (UPPER (l_name), UPPER (rec.KEY_WORD || ' '), 1) = 1
and replace
ELSIF UPPER (rec.POSITION) = 'TRAILING'
AND INSTR (UPPER (l_name), UPPER (' ' || rec.KEY_WORD), -1) > 0
with
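ELSIF UPPER (rec.POSITION) = 'TRAILING'
-- ' ' || KEY_WORD is LENGTH (KEY_WORD) + 1 characters long, so a match at the very end starts at LENGTH (l_name) - LENGTH (KEY_WORD)
AND INSTR (UPPER (l_name), UPPER (' ' || rec.KEY_WORD), -1) = LENGTH (l_name) - LENGTH (rec.KEY_WORD)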
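Note that LTRIM and RTRIM, which the question's code uses for the actual removal, take a set of characters as their second argument rather than a substring, so they can eat into the name itself. A small illustration with made-up sample strings, next to a SUBSTR-based alternative (the sample values are assumptions chosen only to show the effect):

```sql
-- LTRIM/RTRIM strip every leading/trailing character that appears in the set,
-- not the keyword as a whole.
SELECT LTRIM ('MR MRS JONES', 'MR ') AS ltrim_result,   -- 'S JONES'
       RTRIM ('SCOTT PVT', ' PVT')   AS rtrim_result    -- 'SCO'
  FROM dual;

-- SUBSTR removes exactly the keyword plus its separating space.
SELECT SUBSTR ('MR MRS JONES', LENGTH ('MR') + 2)
          AS leading_removed,                           -- 'MRS JONES'
       SUBSTR ('SCOTT PVT', 1, LENGTH ('SCOTT PVT') - LENGTH ('PVT') - 1)
          AS trailing_removed                           -- 'SCOTT'
  FROM dual;
```

So even with the corrected position checks, the removal step is probably safer when written with SUBSTR.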
For the problem with multiple keywords, you have to wrap the FOR loop in an outer loop:
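keyword_found BOOLEAN; -- goes in the declaration section

LOOP
   keyword_found := FALSE;
   FOR rec IN c
   LOOP
      -- inside the IF / ELSIF branches, whenever a keyword is removed:
      keyword_found := TRUE;
   END LOOP;
   -- stop once a full pass over the cursor removed nothing
   EXIT WHEN NOT keyword_found;
END LOOP;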
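Putting the pieces together, the function could look roughly like the sketch below. It assumes the RSRV_KEY_LKUPS columns used in the question (KEY_WORD, POSITION, ACTIVE); the variable names l_found and the inline cursor are my own choices, and the SUBSTR-based matching is swapped in for LTRIM/RTRIM, so treat it as one possible variant rather than the definitive fix.

```sql
CREATE OR REPLACE FUNCTION replace_keyword (p_in_name IN VARCHAR2)
   RETURN VARCHAR2
IS
   l_name    VARCHAR2 (4000);
   l_found   BOOLEAN;   -- TRUE if the current pass removed a keyword
BEGIN
   -- Normalise once: upper-case, collapse whitespace runs, trim both ends.
   l_name := TRIM (REGEXP_REPLACE (UPPER (p_in_name), '[[:space:]]+', ' '));

   LOOP
      l_found := FALSE;

      FOR rec IN (SELECT UPPER (key_word) AS key_word,
                         UPPER (position) AS position
                    FROM rsrv_key_lkups
                   WHERE active = 'Y')
      LOOP
         IF rec.position = 'LEADING'
            AND SUBSTR (l_name, 1, LENGTH (rec.key_word) + 1) = rec.key_word || ' '
         THEN
            -- Drop the keyword and the space that follows it.
            l_name := SUBSTR (l_name, LENGTH (rec.key_word) + 2);
            l_found := TRUE;
         ELSIF rec.position = 'TRAILING'
            AND SUBSTR (l_name, -(LENGTH (rec.key_word) + 1)) = ' ' || rec.key_word
         THEN
            -- Drop the keyword and the space that precedes it.
            l_name := SUBSTR (l_name, 1, LENGTH (l_name) - LENGTH (rec.key_word) - 1);
            l_found := TRUE;
         END IF;
      END LOOP;

      -- Repeat until a full pass over the keywords changes nothing.
      EXIT WHEN NOT l_found;
   END LOOP;

   RETURN l_name;
END replace_keyword;
/
```

With the keywords from the question loaded in the table, SELECT replace_keyword ('MR MRS Jones PVT COMPANY') FROM dual should give JONES, and 'MR Jones MRS COMPANY' should give JONES MRS. The WHEN OTHERS handler is left out on purpose in this sketch so that the original error and its line number reach the caller unchanged.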
Answer 1 (score: 1)
I think this can be done with a single query (which you can wrap in a PL/SQL function if you need one for some reason):
with inpt as (select 'MR Jones MRS COMPANY' text from dual)
select listagg(t1.word, ' ') within group (order by ord) new_text
from (
    select w.*, words.*,
           sum(case when nvl(POSITION, 'TRAILING') = 'TRAILING' then 1 else 0 end)
               over (order by ord rows between unbounded preceding and current row) l,
           sum(case when nvl(POSITION, 'LEADING') = 'LEADING' then 1 else 0 end)
               over (order by ord desc rows between unbounded preceding and current row) t
    from (select regexp_substr(inpt.text, '[^ ]+', 1, level) word, level ord
          from inpt
          connect by level <= regexp_count(inpt.text, ' ') + 1) words
         left outer join RSRV_KEY_LKUPS w on w.KEY_WORD = words.word
) t1
where t1.t > 0 and t1.l > 0
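Edit: explanation:
The 'with' clause just supplies the input string as a column (it is not required).
The inner select aliased as "words" is a well-known technique for splitting a string into rows, one word per row (note that I keep the word order in the ord column).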
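To see the splitting step on its own, here is a minimal sketch that runs against dual only; the input literal is just one of the question's samples:

```sql
-- One row per space-separated word; LEVEL doubles as the word's position.
select regexp_substr(src.text, '[^ ]+', 1, level) word,
       level ord
from (select 'MR MRS Doe PVT COMPANY LTD' text from dual) src
connect by level <= regexp_count(src.text, ' ') + 1;
```

This relies on the words being separated by single spaces, since the row count is derived from the number of spaces.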
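Now we can left outer join the words of the input string to the keywords in the table 'RSRV_KEY_LKUPS'. For each word of the input this tells us whether it is a LEADING keyword, a TRAILING keyword, or null (if it is not in the table).
So far we have (for the input "MR Jones MRS COMPANY"):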
KEY_WORD  POSITION  WORD     ORD
--------  --------  -------  ---
MR        LEADING   MR       1
(null)    (null)    Jones    2
MRS       LEADING   MRS      3
COMPANY   TRAILING  COMPANY  4
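Now comes the tricky part (perhaps there is a better way): we need to know which words to remove. These are all the LEADING words from the top down until a "change", meaning until we hit a null or a TRAILING word, and all the TRAILING words from the bottom up until a "change", meaning a null or a LEADING word. So I used a well-known technique, the cumulative sum: as long as the sum is still zero, the row has to be removed (once we reach the "change", the sum takes a nonzero value).
That's it. All that is left is to gather the rows back into a new string, and since 11gR2 we can use LISTAGG for exactly that.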
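To make the two running sums visible, the following sketch selects them per word instead of aggregating. The lkup subquery is a hypothetical inline stand-in for RSRV_KEY_LKUPS holding only the three keywords needed for this input, so the query can be tried without the real table:

```sql
with inpt as (select 'MR Jones MRS COMPANY' text from dual),
     lkup as (  -- hypothetical stand-in for RSRV_KEY_LKUPS
        select 'MR' key_word, 'LEADING' position from dual union all
        select 'MRS', 'LEADING' from dual union all
        select 'COMPANY', 'TRAILING' from dual),
     words as (
        select regexp_substr(inpt.text, '[^ ]+', 1, level) word, level ord
        from inpt
        connect by level <= regexp_count(inpt.text, ' ') + 1)
select words.ord, words.word, w.position,
       -- top-down: stays 0 while only LEADING keywords have been seen
       sum(case when nvl(w.position, 'TRAILING') = 'TRAILING' then 1 else 0 end)
           over (order by words.ord rows between unbounded preceding and current row) l,
       -- bottom-up: stays 0 while only TRAILING keywords have been seen
       sum(case when nvl(w.position, 'LEADING') = 'LEADING' then 1 else 0 end)
           over (order by words.ord desc rows between unbounded preceding and current row) t
from words left outer join lkup w on w.key_word = words.word
order by words.ord;
```

For this input the (l, t) pairs work out to (0, 3) for MR, (1, 2) for Jones, (1, 1) for MRS and (2, 0) for COMPANY, so the filter t > 0 and l > 0 keeps Jones and MRS. Note that the join compares the words case-sensitively, so a mixed-case input would need UPPER on both sides.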