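I need to tokenize a string in PL/SQL and return only the unique tokens. I have seen examples that tokenize a string, but none that return only the unique tokens.
For example, the query: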
select tokenize('hi you person person', ' ') as col1 from dual;
should return TOKEN_LIST('hi','you','person')
rather than TOKEN_LIST('hi','you','person','person')
Answer 0 (score: 6)
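with t as (select 'aaaa bbbb cccc dddd eeee ffff aaaa' as txt from dual)
-- end of sample data
-- one row per word: level walks the word positions, distinct drops the repeats
select distinct regexp_substr(txt, '[^[:space:]]+', 1, level) as word
from t
-- regexp_count (Oracle 11g and later) counts the words, so runs of spaces add no NULL rows
connect by level <= regexp_count(txt, '[^[:space:]]+');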
The script above produces the following result:
WORD
dddd
eeee
bbbb
ffff
cccc
aaaa
The idea was shamelessly stolen from an OTN Community answer.
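To get the result in the shape the question asks for, the same query can be wrapped in a function. What follows is a minimal sketch, not a tested implementation. The definition of TOKEN_LIST is an assumption (the question names the type but does not show it), and the parameter names p_text and p_delim are purely illustrative. It also assumes the delimiter is a single ordinary character such as a space or a comma, because it is placed inside a regular-expression bracket list. Instead of DISTINCT, the sketch groups by token and orders by first position, since DISTINCT gives no guarantee about the order of the rows.

```sql
-- Assumed collection type: the question mentions TOKEN_LIST but not its definition.
CREATE OR REPLACE TYPE token_list AS TABLE OF VARCHAR2(4000);
/

CREATE OR REPLACE FUNCTION tokenize (
    p_text  IN VARCHAR2,
    p_delim IN VARCHAR2 DEFAULT ' '   -- assumed: one ordinary character
) RETURN token_list
IS
    l_tokens token_list;
BEGIN
    SELECT token
    BULK COLLECT INTO l_tokens
    FROM (
        -- One row per token; LEVEL is the position of the token in the string.
        SELECT REGEXP_SUBSTR(p_text, '[^' || p_delim || ']+', 1, LEVEL) AS token,
               LEVEL AS pos
        FROM dual
        CONNECT BY REGEXP_SUBSTR(p_text, '[^' || p_delim || ']+', 1, LEVEL) IS NOT NULL
    )
    WHERE token IS NOT NULL      -- drops the single NULL row produced for empty input
    GROUP BY token               -- collapses duplicate tokens
    ORDER BY MIN(pos);           -- keeps tokens in order of first appearance

    RETURN l_tokens;
END tokenize;
/
```

With this in place, select tokenize('hi you person person', ' ') as col1 from dual; should return the three tokens hi, you and person, in order of first appearance.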