Replace string with random text - Oracle SQL

Date: 2019-03-23 15:44:37

Tags: sql oracle

I have a table, table1, with one column, edi_value, of type CLOB.

These are the entries:

seq  edi_message
1    ISA*00*          *00*          *08*9254110060     *ZZ*123456789      *041216*0805*U*00501*000095071*0*P*>~
    GS*AG*5137624388*123456789*20041216*0805*95071*X*005010~
    ST*824*021390001*005010X186A1~

2    ISA*00*          *00*          *08*56789876678     *ZZ*123456789      *041216*0805*U*00501*000095071*0*P*>~
    GS*AG*5137624388*123456789*20041216*0805*95071*X*005010~
    ST*824*021390001*005010X186A1~
  

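Please note: the number of lines can vary from 3 to 500.

I am looking for the following conditions:

  • Ignore the text before the first * on every line; it must not change. For example, GS and ST should stay as they are. Only the text after the first * should be randomized.
  • Replace digits [0-9] with random digits. For example, if 0 is replaced with 1, then it should be 1 throughout.
  • Replace letters [A-Za-z] with random letters. For example, if A is replaced with W, then it should be replaced with W throughout.
  • Leave special characters as they are.
  

Each character/digit must map to exactly one random character/digit.

The output could be: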

seq  edi_message
1    ISA*11*          *11*          *13*4030111101     *QQ*102030234      *101010*1313*U*11311*111143121*1*V*>~
    GS*WE*3122000233*102030234*01101010*1313*43121*X*113111~
    ST*300*101241111*113111X130A1~

2    ISA*11*          *11*          *13*30234320023     *QQ*102030234      *101010*1313*U*11311*111143121*1*V*>~
    GS*WE*3122000233*102030234*01101010*1313*43121*X*113111~
    ST*300*101241111*113111X130W1~

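How can this be achieved in Oracle SQL?

2 answers:

Answer 0 (score: 3)

You can use translate together with a helper function that generates the random strings (though @LukStorms has a much neater SQL solution for that using LISTAGG), plus a method for splitting the value into lines and then concatenating the translated lines back together (I've used a pure SQL method here for demonstration purposes):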

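create or replace function f(p_low integer, p_high integer)
    return varchar2 as
  -- Returns the characters chr(p_low)..chr(p_high) in random order.
  r varchar2(2000);
  x integer;
begin
  for i in p_low..p_high loop
    -- Pick a random insertion point from 0 to length(r); nvl covers the empty (null) string.
    x := trunc(dbms_random.value(0, nvl(length(r), 0) + 1));
    r := substr(r,1,x)||chr(i)||substr(r,x+1);
  end loop;
  return r;
end;
/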
select * from table1;
| EDI_VALUE                                                                                                                                                                                                        |
| :--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| ISA*00*          *00*          *08*9254110060     *ZZ*123456789      *041216*0805*U*00501*000095071*0*P*>~<br>    GS*AG*5137624388*123456789*20041216*0805*95071*X*005010~<br>    ST*824*021390001*005010X186A1~ |
| ISA*00*          *00*          *08*56789876678     *ZZ*123456789      *041216*0805*U*00501*000095071*0*P*>~<br>    GS*AG*5137624388*123456789*20041216*0805*95071*X*005010~<br>    ST*824*021390001*005010X186A  |
with t as (select f(48,57)||f(65,90) translate_chars from dual)
select (select new_value
        from (select substr(sys_connect_by_path(r_line,'
'),2) new_value, connect_by_isleaf isleaf
              from (select lvl
                         , substr(line,1,instr(line,'*')-1)||
                             translate(substr(line,instr(line,'*'))
                                      ,'0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ'
                                      ,(select translate_chars from t)) r_line
                    from (select level lvl
                               , regexp_substr(edi_value,'^.*$',1,level,'m') line
                          from (select table1.edi_value from dual)
                          connect by level <= regexp_count(edi_value,'^.*$',1,'m')))
              start with lvl=1 connect by lvl=(prior lvl)+1)
        where isleaf=1)
from table1;
| (SELECTNEW_VALUEFROM(SELECTSUBSTR(SYS_CONNECT_BY_PATH(R_LINE,''),2)NEW_VALUE,CONNECT_BY_ISLEAFISLEAFFROM(SELECTLVL,SUBSTR(LINE,1,INSTR(LINE,'*')-1)||TRANSLATE(SUBSTR(LINE,INSTR(LINE,'*')),'0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ',(SELECTTRANSLATE_CHARSFR |
| :---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| ISA*66*          *66*          *67*1935006626     *VV*098532471      *650902*6763*K*66360*666613640*6*P*>~<br>    GS*GZ*3084295877*098532471*96650902*6763*13640*I*663606~<br>    ST*795*690816660*663606I072G0~                                            |
| ISA*66*          *66*          *67*32471742247     *VV*098532471      *650902*6763*K*66360*666613640*6*P*>~<br>    GS*GZ*3084295877*098532471*96650902*6763*13640*I*663606~<br>    ST*795*690816660*663606I072G                                             |

db<>fiddle here
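
To see what the innermost expression of that query does in isolation, here is a minimal sketch on a single literal line. The digit mapping is a fixed, made-up permutation (an assumption for illustration, not the random string that f generates), so the result is reproducible:

```sql
-- Illustrative fixed mapping: 0->1, 1->0, 2->3, 3->2, 4->5, 5->4, 6->7, 7->6, 8->9, 9->8
select substr(line, 1, instr(line, '*') - 1)            -- text before the first *, kept as-is
       || translate(substr(line, instr(line, '*')),     -- text from the first * onwards
                    '0123456789',
                    '1032547698') as masked
from (select 'GS*AG*5137624388*123456789' as line from dual);
-- masked: GS*AG*4026735299*032547698
```

The segment identifier before the first * is untouched, and every occurrence of a digit maps to the same replacement. A line that contains no * at all would be translated in full, because instr returns 0 for it.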

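Answer 1 (score: 1)

You can use CTEs with CONNECT BY to generate the strings of letters and digits.

Then use the ordered and the scrambled strings in the translate.

A CROSS APPLY with a regex can be used to split the message into parts.
Then translate only the parts that start with a *.
And use LISTAGG to glue the parts back together.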

WITH 
NUMS as
(
  select 
  LISTAGG(n, '') WITHIN GROUP (ORDER BY n) as n_from,
  LISTAGG(n, '') WITHIN GROUP (ORDER BY DBMS_RANDOM.VALUE) as n_to
  from (select level-1 n from dual connect by level <= 10) 
),
LETTERS as
(
  select 
  LISTAGG(c, '') WITHIN GROUP (ORDER BY c) as c_from,
  LISTAGG(c, '') WITHIN GROUP (ORDER BY DBMS_RANDOM.VALUE) as c_to
  from (select chr(ascii('A')+level-1 ) c from dual connect by level <= 26) 
)
SELECT ca.scrambled as scrambled_message
FROM table1 t
CROSS JOIN NUMS
CROSS JOIN LETTERS
CROSS APPLY 
(
 SELECT LISTAGG(CASE WHEN part like '*%' then translate(part, n_from||c_from, n_to||c_to) else part end, '') WITHIN GROUP (ORDER BY lvl) as scrambled
 FROM
 (
  SELECT 
  level AS lvl,
  REGEXP_SUBSTR(t.edi_message,'[*]\S+|[^*]+',1,level,'m') AS part
  FROM dual
  CONNECT BY level <= regexp_count(t.edi_message, '[*]\S+|[^*]+')+1
 ) parts
) ca;

A test on db<>fiddle here

Example output:

SCRAMBLED_MESSAGE
-----------------------------------------------------------------------------------------------------------
ISA*99*          *99*          *92*3525999959     *PP*950525023      *959595*9292*A*99299*999932909*9*J*>~
    GS*WQ*2900555022*950525023*59959595*9292*32909*I*992999~
    ST*255*959039999*992999I925V9~
ISA*99*          *99*          *92*25023205502     *PP*950525023      *959595*9292*A*99299*999932909*9*J*>~
    GS*WQ*2900555022*950525023*59959595*9292*32909*I*992999~
    ST*255*959039999*992999I925W9~
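
Both answers map only digits and upper-case letters, which covers the sample data. If the messages could also contain lower-case letters (the question mentions [A-Za-z]), one possible extension is to reuse the shuffled upper-case alphabet in lower case. This is a sketch under that assumption; the literal 'ab*Cd*ef' is made-up test input, not data from the question:

```sql
with letters as (
  select listagg(c) within group (order by c)                 as c_from,  -- alphabet in sorted order
         listagg(c) within group (order by dbms_random.value) as c_to     -- same letters, shuffled
  from (select chr(ascii('A') + level - 1) as c from dual connect by level <= 26)
)
-- Append the lower-case versions so that a..z are mapped as well;
-- positions stay aligned, so the mapping remains one-to-one.
select translate('ab*Cd*ef',
                 c_from || lower(c_from),
                 c_to   || lower(c_to)) as scrambled
from letters;
```

Because the lower-case mapping mirrors the upper-case one, a and A always map to the same letter in different case. A second, independently shuffled alphabet could be used instead if that is undesirable.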