Create unique string for names in Oracle

Time: 2014-01-22 15:02:10

Tags: oracle plsql

I can do this programmatically, but I'm looking for a cleaner solution.

Suppose I have the following table:

Last Name       First Name
Smith           Albert       
Smith           Alphonse    
Smith           Jason         
Johnson         Charles
Roberts         Chris
Roberts         Christian

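I want to create a unique string with the following rules:

  • If the last name is already unique, return just the last name
  • If the last name is shared, return the first initial (or more letters, as needed), followed by a period, then the last name

For Albert Smith, I would return Alb.Smith
For Charles Johnson, I would return Johnson
For Christian Roberts, I would return Christ.Roberts

Does anyone have ideas on how to do this directly in an Oracle SQL statement, or should I stick with doing it in a program?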

2 Answers:

Answer 0: (score: 6)

A version using recursive subquery factoring (a recursive CTE), which requires 11gR2:

with t (last_name, first_name, orig_rn, part, part_length, remaining) as (
  select last_name, first_name,
    row_number() over (order by last_name, first_name),
    cast (null as varchar2(20)), 0, length(first_name)
  from t42
  union all
  select last_name, first_name, orig_rn,
    part || substr(first_name, part_length + 1, 1),
    part_length + 1,
    remaining - 1
  from t
  where remaining > 0
),
u as (
  select last_name, first_name, orig_rn, part, part_length,
    count(distinct orig_rn) over (partition by last_name) as last_name_count,
    count(distinct orig_rn) over (partition by last_name, part) as part_count
  from t
),
v as (
  select last_name, first_name, orig_rn, part, last_name_count,
  row_number() over (partition by orig_rn order by part_length) as rn
  from u
  where (part_count = 1 or part = first_name)
)
select case when last_name_count = 1 then null
  when part = first_name then first_name || ' '
  else part || '. '
  end || last_name as condensed_name
from v
where rn = 1
order by orig_rn;

Which gives:

CONDENSED_NAME                               
----------------------------------------------
Johnson                                        
Chris Roberts                                  
Christ. Roberts                                
Alb. Smith                                     
Alp. Smith                                     
J. Smith                                       

SQL Fiddle
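
The query reads from a table called t42, whose definition is not shown here. A minimal sketch of a matching setup, assuming two varchar2(20) columns (the sizes are a guess, chosen to match the cast in the query) and the six rows from the question:

```sql
-- Hypothetical setup: the table name t42 comes from the query above;
-- the column sizes are assumptions.
create table t42 (
  last_name  varchar2(20),
  first_name varchar2(20)
);

-- The six sample rows from the question.
insert into t42 (last_name, first_name) values ('Smith', 'Albert');
insert into t42 (last_name, first_name) values ('Smith', 'Alphonse');
insert into t42 (last_name, first_name) values ('Smith', 'Jason');
insert into t42 (last_name, first_name) values ('Johnson', 'Charles');
insert into t42 (last_name, first_name) values ('Roberts', 'Chris');
insert into t42 (last_name, first_name) values ('Roberts', 'Christian');

commit;
```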

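The t CTE is recursive. It starts with the original table rows and generates additional rows for each possible contraction of the first name: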

with t (last_name, first_name, orig_rn, part, part_length, remaining) as (
  select last_name, first_name,
    row_number () over (order by last_name, first_name),
    cast (null as varchar2(20)), 0, length(first_name)
  from t42
  union all
  select last_name, first_name, orig_rn,
    part || substr(first_name, part_length + 1, 1),
    part_length + 1,
    remaining - 1
  from t
  where remaining > 0
)
select last_name, first_name, part
from t
where last_name = 'Johnson'
order by orig_rn, part_length;

LAST_NAME            FIRST_NAME           PART                   
-------------------- -------------------- ------------------------
Johnson              Charles                                       
Johnson              Charles              C                        
Johnson              Charles              Ch                       
Johnson              Charles              Cha                      
Johnson              Charles              Char                     
Johnson              Charles              Charl                    
Johnson              Charles              Charle                   
Johnson              Charles              Charles                  
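
On databases older than 11gR2, where recursive subquery factoring is not available, the same list of contractions could be produced by joining to a row generator instead. This is an alternative sketch rather than part of the query above; it assumes first names are at most 20 characters (matching the varchar2(20) cast) and it leaves out the zero-length row.

```sql
-- Row generator: the numbers 1..20 from the "connect by level" idiom.
with lengths as (
  select level as part_length
  from dual
  connect by level <= 20   -- assumed maximum first-name length
)
-- One row per prefix of the first name, shortest first.
select t.last_name, t.first_name,
  substr(t.first_name, 1, l.part_length) as part
from t42 t
join lengths l
on l.part_length <= length(t.first_name)
where t.last_name = 'Johnson'
order by l.part_length;
```

The recursive version carries part_length and orig_rn along with each row, which the later CTEs rely on, so this variant would need those columns added before it could replace t.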

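The next CTE, u (yes, sorry about the names, I wasn't feeling inspired), compares the values across all rows and counts the occurrences. Anything with a count of 1 is unique.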

...
u as (
  select last_name, first_name, orig_rn, part, part_length,
    count(distinct orig_rn) over (partition by last_name) as last_name_count,
    count(distinct orig_rn) over (partition by last_name, part) as part_count
  from t
)
select last_name, first_name, part, last_name_count, part_count
from u
where last_name = 'Roberts'
order by orig_rn, part_length;

LAST_NAME            FIRST_NAME           PART                     LAST_NAME_COUNT PART_COUNT
-------------------- -------------------- ------------------------ --------------- ----------
Roberts              Chris                                                       2          2 
Roberts              Chris                C                                      2          2 
Roberts              Chris                Ch                                     2          2 
Roberts              Chris                Chr                                    2          2 
Roberts              Chris                Chri                                   2          2 
Roberts              Chris                Chris                                  2          2 
Roberts              Christian                                                   2          2 
Roberts              Christian            C                                      2          2 
Roberts              Christian            Ch                                     2          2 
Roberts              Christian            Chr                                    2          2 
Roberts              Christian            Chri                                   2          2 
Roberts              Christian            Chris                                  2          2 
Roberts              Christian            Christ                                 2          1 
Roberts              Christian            Christi                                2          1 
Roberts              Christian            Christia                               2          1 
Roberts              Christian            Christian                              2          1 

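The third CTE, v, only looks at the unique ones (plus the full first name, as a fallback), and ranks them by the length of the value; so the shortest contraction of the first name that is unique across all the records is ranked as 1: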

...
v as (
  select last_name, first_name, orig_rn, part, last_name_count,
  row_number() over (partition by orig_rn order by part_length) as rn
  from u
  where (part_count = 1 or part = first_name)
)
select last_name, first_name, part, last_name_count
from v
where rn = 1
order by orig_rn;

LAST_NAME            FIRST_NAME           PART                     LAST_NAME_COUNT
-------------------- -------------------- ------------------------ ---------------
Johnson              Charles                                                     1 
Roberts              Chris                Chris                                  2 
Roberts              Christian            Christ                                 2 
Smith                Albert               Alb                                    3 
Smith                Alphonse             Alp                                    3 
Smith                Jason                J                                      3 

The final query then picks out those ranked 1, which are the shortest unique values, and formats them the way you wanted.

If two people have exactly the same name then both are spelled out in full (demo), which seems to be what you asked for in the comments.
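
A quick way to check the same-name behavior, as a sketch: add a second row with an identical full name and re-run the main query. This assumes the t42 table already holds the six sample rows; the duplicate row itself is made-up test data.

```sql
-- Hypothetical test row: a second Jason Smith, identical to the existing one.
insert into t42 (last_name, first_name) values ('Smith', 'Jason');

-- Every contraction of 'Jason' is now shared by two rows (part_count = 2),
-- so only the "part = first_name" branch of v's filter lets those rows through.
-- Re-running the full query should therefore show 'Jason Smith' twice,
-- while 'Alb. Smith' and 'Alp. Smith' stay as they were.

-- Undo the uncommitted test row afterwards.
rollback;
```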

Not sure this really qualifies as 'cleaner', except that it only hits the original table once.

Answer 1: (score: 3)

Try this:

with
last_names as (
  select last_name, count(*) as last_name_count 
  from table_name 
  group by last_name )

select case 
         when b.last_name_count = 1 then a.last_name 
         else substr(a.first_name,1,1)||'. '||a.last_name 
       end as name
from table_name a
join last_names b
on a.last_name = b.last_name;

Replace table_name with the correct name.
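
Note that this query only ever takes the first initial, so people who share both a last name and an initial end up with the same string. A sketch of a check for that, wrapping the same logic in a second CTE; t42 (the table name from the first answer) stands in for table_name, and the names CTE and the occurrences alias are illustrative:

```sql
with
last_names as (
  select last_name, count(*) as last_name_count
  from t42
  group by last_name
),
-- Same CASE expression as the query above, kept as a named subquery.
names as (
  select case
           when b.last_name_count = 1 then a.last_name
           else substr(a.first_name, 1, 1) || '. ' || a.last_name
         end as name
  from t42 a
  join last_names b
  on a.last_name = b.last_name
)
-- Any row returned here is a generated string that is not unique.
select name, count(*) as occurrences
from names
group by name
having count(*) > 1;
```

With the sample data from the question this reports A. Smith and C. Roberts, each occurring twice, so the single-initial approach only meets the uniqueness rule when initials differ within each last name.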