Oracle query to parse a name

Time: 2018-06-07 17:21:34

Tags: sql oracle

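I have a full name (last name, first name, middle name, including a suffix) stored in a single field. These are the business rules for parsing the name:

- Last name: the characters from the first position up to the comma are the last name.
- First name: the characters from the first one after the comma up to the next space go to the first name.
- Suffix: once the first and last names are obtained, search the remainder of the name for a suffix, based on the values (Jr, Sr, II, III, IV, V).
- Middle name: put the remaining characters into the middle name.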

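  1. LST,FRST MDL SR (here LST is the last name, FRST is the first name, MDL is the middle name, SR is the suffix)
  2. LST,FRST SR MDL MDL2 (here LST is the last name, FRST is the first name, "MDL MDL2" is the middle name, SR is the suffix)
  3. LST,FRST MDL1 JR MDL2 MDL3 (here LST is the last name, FRST is the first name, "MDL1 MDL2 MDL3" is the middle name, JR is the suffix)

In other words, the suffix and middle names come in various arrangements. Is it possible to write a query that extracts the suffix and puts the rest into the middle name field?

Here is my query, but I am looking for a simpler approach. My query also only checks the first 3 space-separated words of the middle name when looking for the suffix.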

    WITH IDN_NAM AS
    (
     SELECT RECORD_NUMBER, IDN.INDEX1,
      trim(regexp_substr(REPLACE(IDN.IDN_NAM,',',' '),  '[^ ]+',1,2)) AS FIRST_NAME,
      trim(regexp_substr(REPLACE(IDN.IDN_NAM,',',' '),  '[^ ]+',1,1)) AS LAST_NAME,
      CASE WHEN instr(REPLACE(IDN.IDN_NAM,',',' '),' ',1,2)+1 = 1 THEN NULL ELSE substr(REPLACE(IDN.IDN_NAM,',',' '),instr(REPLACE(IDN.IDN_NAM,',',' '),' ',1,2)+1) END AS MID_NAME,
      TRIM(IDN.IDN_NAM) AS FULL_NAME
      FROM 
      (
       SELECT IDN.RECORD_NUMBER, IDN_NAM IDN_NAM_ORIG, replace(replace(replace(REPLACE(IDN.IDN_NAM, ',', ' '),' ','<>'),'><',''),'<>',' ') IDN_NAM , IDN.INDEX1
       FROM SNM_TMP_IDENTITY_IDN_NAM IDN,
            TMP_LEGACY_EVENT T
       WHERE IDN.RECORD_NUMBER = T.RECORD_NUMBER 
        AND NOT REGEXP_LIKE(IDN_NAM,'DLE.+[[:digit:]]')
      ) IDN
    )
    SELECT DECODE(INDEX1, 1, 'T', 'F') MASTER_IND, FIRST_NAME, LAST_NAME,--, MID_NAME, MID_NAM1, MID_NAM2, MID_NAM3, SUF1, SUF2, SUF3,
    CASE WHEN SUF3 IS NULL AND SUF2 IS NULL AND SUF1 IS NULL THEN MID_NAME
        WHEN SUF3 IS NOT NULL THEN NVL(MID_NAM1,'')||CASE WHEN MID_NAM2 IS NOT NULL THEN ' '||MID_NAM2 ELSE '' END
        WHEN SUF2 IS NOT NULL THEN NVL(MID_NAM1,'')||CASE WHEN MID_NAM3 IS NOT NULL THEN ' '||MID_NAM3 ELSE '' END
        WHEN SUF1 IS NOT NULL THEN NVL(MID_NAM2,'')||CASE WHEN MID_NAM3 IS NOT NULL THEN ' '||MID_NAM3 ELSE '' END
    END MIDDLE_NAME,
    NVL(SUF3, NVL(SUF2, SUF1)) NAME_SUFFIX_CD
    FROM
    (
    SELECT NM.*,
    (SELECT MAX(NAME_SUFFIX_CODE) FROM CRRMS_CODED.CODED_NAME_SUFFIX WHERE UPPER(REPLACE(NAME_SUFFIX_CODE,'.', '')) = UPPER(MID_NAM1)) SUF1,
    (SELECT MAX(NAME_SUFFIX_CODE) FROM CRRMS_CODED.CODED_NAME_SUFFIX WHERE UPPER(REPLACE(NAME_SUFFIX_CODE,'.', '')) = UPPER(MID_NAM2)) SUF2,
    (SELECT MAX(NAME_SUFFIX_CODE) FROM CRRMS_CODED.CODED_NAME_SUFFIX WHERE UPPER(REPLACE(NAME_SUFFIX_CODE,'.', '')) = UPPER(MID_NAM3)) SUF3
    FROM
    (
     SELECT I.*,
      trim(regexp_substr(MID_NAME,  '[^ ]+',1,1)) AS MID_NAM1,
      trim(regexp_substr(MID_NAME,  '[^ ]+',1,2)) AS MID_NAM2,
      trim(regexp_substr(MID_NAME,  '[^ ]+',1,3)) AS MID_NAM3
     FROM IDN_NAM I
    ) NM
    )
    ;
    

2 answers:

Answer 0 (score: 0)

Here is a query that might fit the bill, though it goes a bit beyond the problem definition, since it also handles prefixes:

with names(name) as (
            select 'LST, FRST MDL SR' from dual
  union all select 'LST, FRST SR MDL MDL2' from dual
  union all select 'LST, FRST MDL1 JR MDL2 MDL3' from dual
  union all select 'Jones, John Paul' from dual
  union all select 'Jones, Mr. John Jr Paul' from dual
  union all select 'Jones, John Paul Jr Henry' from dual
  union all select 'Henry, John Paul Sr' from dual
  union all select 'Masters, Lee II' from dual
)
select name
     , REGEXP_SUBSTR(name, '([^,]+), ?(((dr|rev|mr|mrs|ms|miss)[.]?) )?([^ ]+) ?(.*)',1,1,'i',3) pfx
     , REGEXP_SUBSTR(name, '([^,]+), ?(((dr|rev|mr|mrs|ms|miss)[.]?) )?([^ ]+) ?(.*)',1,1,'i',5) frst
     , rtrim(REGEXP_REPLACE(
       REGEXP_SUBSTR(name, '([^,]+), ?(((dr|rev|mr|mrs|ms|miss)[.]?) )?([^ ]+) ?(.*)',1,1,'i',6)
       ,'(^|[[:space:]])(jr|sr|ii|iii|iv|v)([[:space:]]|$)','\1',1,1,'i')) mdl
     , REGEXP_SUBSTR(name, '([^,]+), ?(((dr|rev|mr|mrs|ms|miss)[.]?) )?([^ ]+) ?(.*)',1,1,'i',1) Lst
     , REGEXP_SUBSTR(
       REGEXP_SUBSTR(name, '([^,]+), ?(((dr|rev|mr|mrs|ms|miss)[.]?) )?([^ ]+) ?(.*)',1,1,'i',6)
       ,'(^|[[:space:]])(jr|sr|ii|iii|iv|v)([[:space:]]|$)',1,1,'i',2) SFX
from names;

If you don't want the prefix, then this might work for you:

select name
     , REGEXP_SUBSTR(name, '([^,]+), ?([^ ]+) ?(.*)',1,1,'i',2) frst
     , rtrim(REGEXP_REPLACE(
       REGEXP_SUBSTR(name, '([^,]+), ?([^ ]+) ?(.*)',1,1,'i',3)
       ,'(^|[[:space:]])(jr|sr|ii|iii|iv|v)([[:space:]]|$)','\1',1,1,'i')) mdl
     , REGEXP_SUBSTR(name, '([^,]+), ?([^ ]+) ?(.*)',1,1,'i',1) Lst
     , REGEXP_SUBSTR(
       REGEXP_SUBSTR(name, '([^,]+), ?([^ ]+) ?(.*)',1,1,'i',3)
       ,'(^|[[:space:]])(jr|sr|ii|iii|iv|v)([[:space:]]|$)',1,1,'i',2) SFX
from names;
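
Because the main pattern is repeated in every column, one possible variation is to split the name once in a second subquery and then work on the remainder. This is only a sketch: the `parts` subquery and the `rest_of_name` column are names made up for illustration, and the sample rows are a subset of the ones above.

```sql
with names(name) as (
            select 'LST, FRST MDL SR' from dual
  union all select 'Jones, John Paul' from dual
  union all select 'Masters, Lee II' from dual
), parts as (
  -- split each name once: last name, first name, and everything after it
  select name
       , regexp_substr(name, '([^,]+), ?([^ ]+) ?(.*)', 1, 1, 'i', 1) as lst
       , regexp_substr(name, '([^,]+), ?([^ ]+) ?(.*)', 1, 1, 'i', 2) as frst
       , regexp_substr(name, '([^,]+), ?([^ ]+) ?(.*)', 1, 1, 'i', 3) as rest_of_name
  from names
)
select name, lst, frst
       -- middle name: the remainder with the first whole-word suffix removed
     , rtrim(regexp_replace(rest_of_name
       ,'(^|[[:space:]])(jr|sr|ii|iii|iv|v)([[:space:]]|$)','\1',1,1,'i')) mdl
       -- suffix: the second capture group of the same pattern
     , regexp_substr(rest_of_name
       ,'(^|[[:space:]])(jr|sr|ii|iii|iv|v)([[:space:]]|$)',1,1,'i',2) sfx
from parts;
```

The expressions are the same as in the query above; only the place where the remainder is computed changes.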

The same regular expression is used throughout to identify and extract each part of the name:

'([^,]+), ?([^ ]+) ?(.*)'
 |     |   |     |  +--+ -> 3) Middle Names (including Suffix)
 |     |   +-----+ -------> 2) First Name
 +-----+ -----------------> 1) Last Name
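
As a quick check of the diagram, the sketch below pulls each capture group from a single sample value. The inline view and the column aliases (`n`, `grp1_last`, `grp2_first`, `grp3_rest`) are invented for this illustration.

```sql
-- The last argument of REGEXP_SUBSTR selects which capture group to return.
select regexp_substr(n, '([^,]+), ?([^ ]+) ?(.*)', 1, 1, 'i', 1) as grp1_last
     , regexp_substr(n, '([^,]+), ?([^ ]+) ?(.*)', 1, 1, 'i', 2) as grp2_first
     , regexp_substr(n, '([^,]+), ?([^ ]+) ?(.*)', 1, 1, 'i', 3) as grp3_rest
from (select 'Jones, John Paul Jr Henry' as n from dual);
```

For this value the three columns should come back as `Jones`, `John`, and `Paul Jr Henry`: the suffix is still part of group 3 at this stage.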

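The first and last names are straightforward. The middle names and the suffix, however, take some extra work. To get either the middle names or the suffix, a second regular expression is needed: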

'(^|[[:space:]])(jr|sr|ii|iii|iv|v)([[:space:]]|$)'

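Using this regular expression with the REGEXP_REPLACE function removes the suffix and leaves only the middle names. Similarly, using the same regular expression with the REGEXP_SUBSTR function retrieves the suffix itself.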
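
To see the two uses side by side, here is a small sketch that applies the suffix pattern to a few remainder strings on their own. The `remainders` subquery and its column `s` are hypothetical names chosen for the example.

```sql
with remainders(s) as (
            select 'Paul Jr Henry' from dual
  union all select 'SR MDL MDL2' from dual
  union all select 'MDL SR' from dual
)
select s
       -- group 2 of the pattern is the suffix itself
     , regexp_substr(s
       ,'(^|[[:space:]])(jr|sr|ii|iii|iv|v)([[:space:]]|$)',1,1,'i',2) sfx
       -- replacing the match with '\1' keeps the leading separator (if any)
       -- and drops the suffix together with the separator that follows it
     , rtrim(regexp_replace(s
       ,'(^|[[:space:]])(jr|sr|ii|iii|iv|v)([[:space:]]|$)','\1',1,1,'i')) mdl
from remainders;
```

The `(^|[[:space:]])` and `([[:space:]]|$)` groups make the suffix match only as a whole word, so a name such as `Victor` is not mistaken for the suffix `V`. With the rows above, `Paul Jr Henry` should give `Jr` and `Paul Henry`, `SR MDL MDL2` should give `SR` and `MDL MDL2`, and `MDL SR` should give `SR` and `MDL`.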

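Answer 1 (score: 0)

DIN,

Following your logic for getting the last name, first name, suffix, and middle name, I took the 3 sample inputs and wrote a query. It works fine for all 3.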

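select last_name, first_name
-- suffix: matched only as a whole word, so the V in "Victor" is not picked up
, regexp_substr(rest, '(^|[[:space:]])(jr|sr|ii|iii|iv|v)([[:space:]]|$)', 1, 1, 'i', 2) as suffix
-- middle name: whatever is left of the remainder once the suffix is removed
, trim(regexp_replace(rest, '(^|[[:space:]])(jr|sr|ii|iii|iv|v)([[:space:]]|$)', ' ', 1, 1, 'i')) as middle_name
from (
select substr(col, 1, instr(col, ',', 1, 1) - 1) as last_name
-- first name: the first run of non-space characters after the comma
, regexp_substr(col, ', ?([^ ]+)', 1, 1, 'i', 1) as first_name
-- remainder: everything after the "last, first" head of the string
, regexp_replace(col, '^[^,]*, ?[^ ]+') as rest
from (
select 'LST, FRST MDL1 JR MDL2 MDL3' as col from dual));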