I can do this programmatically, but I'm looking for a cleaner solution.
Suppose I have the following table:
Last Name   First Name
Smith       Albert
Smith       Alphonse
Smith       Jason
Johnson     Charles
Roberts     Chris
Roberts     Christian
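I'd like to create a unique name for each row, with the following rules:
For Albert Smith, I would return Alb.Smith
For Charles Johnson, I would return Johnson
For Christian Roberts, I would return Christ.Roberts
Does anyone have any ideas on how to do this directly in an Oracle SQL statement, or should I stick with doing it in a program?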
Answer 0 (score: 6)
A version using recursive subquery factoring (a recursive CTE), which requires 11gR2:
with t (last_name, first_name, orig_rn, part, part_length, remaining) as (
  -- anchor member: one row per original row, with an empty contraction
  select last_name, first_name,
    row_number() over (order by last_name, first_name),
    cast (null as varchar2(20)), 0, length(first_name)
  from t42
  union all
  -- recursive member: extend the contraction by one character at a time
  select last_name, first_name, orig_rn,
    part || substr(first_name, part_length + 1, 1),
    part_length + 1,
    remaining - 1
  from t
  where remaining > 0
),
u as (
  -- count how many original rows share each last name and each contraction
  select last_name, first_name, orig_rn, part, part_length,
    count(distinct orig_rn) over (partition by last_name) as last_name_count,
    count(distinct orig_rn) over (partition by last_name, part) as part_count
  from t
),
v as (
  -- keep unique contractions (or the full first name) and rank them by length
  select last_name, first_name, orig_rn, part, last_name_count,
    row_number() over (partition by orig_rn order by part_length) as rn
  from u
  where (part_count = 1 or part = first_name)
)
select case when last_name_count = 1 then null
    when part = first_name then first_name || ' '
    else part || '. '
  end || last_name as condensed_name
from v
where rn = 1
order by orig_rn;
This gives:
CONDENSED_NAME
----------------------------------------------
Johnson
Chris Roberts
Christ. Roberts
Alb. Smith
Alp. Smith
J. Smith
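The query reads from a table called t42 whose definition is not shown. To reproduce the output above, a minimal setup sketch follows. The column sizes are an assumption, chosen only to match the 20-character columns in the sample output; the rows are the ones from the question.

```sql
-- Hypothetical definition of t42; varchar2(20) is an assumption, not from the answer.
create table t42 (
  last_name  varchar2(20),
  first_name varchar2(20)
);

-- Sample rows taken from the question.
insert into t42 (last_name, first_name) values ('Smith', 'Albert');
insert into t42 (last_name, first_name) values ('Smith', 'Alphonse');
insert into t42 (last_name, first_name) values ('Smith', 'Jason');
insert into t42 (last_name, first_name) values ('Johnson', 'Charles');
insert into t42 (last_name, first_name) values ('Roberts', 'Chris');
insert into t42 (last_name, first_name) values ('Roberts', 'Christian');

commit;
```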
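The t CTE is recursive. It starts with the original table rows and generates additional rows for every possible contraction of the first name: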
with t (last_name, first_name, orig_rn, part, part_length, remaining) as (
  select last_name, first_name,
    row_number () over (order by last_name, first_name),
    cast (null as varchar2(20)), 0, length(first_name)
  from t42
  union all
  select last_name, first_name, orig_rn,
    part || substr(first_name, part_length + 1, 1),
    part_length + 1,
    remaining - 1
  from t
  where remaining > 0
)
select last_name, first_name, part
from t
where last_name = 'Johnson'
order by orig_rn, part_length;

LAST_NAME            FIRST_NAME           PART
-------------------- -------------------- ------------------------
Johnson              Charles
Johnson              Charles              C
Johnson              Charles              Ch
Johnson              Charles              Cha
Johnson              Charles              Char
Johnson              Charles              Charl
Johnson              Charles              Charle
Johnson              Charles              Charles
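The next CTE, u (yes, sorry about the names, I wasn't feeling inspired), compares the values across all rows and counts how many original rows share each last name, and how many share each contraction within that last name. Anything with a count of 1 is unique.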
...
u as (
  select last_name, first_name, orig_rn, part, part_length,
    count(distinct orig_rn) over (partition by last_name) as last_name_count,
    count(distinct orig_rn) over (partition by last_name, part) as part_count
  from t
)
select last_name, first_name, part, last_name_count, part_count
from u
where last_name = 'Roberts'
order by orig_rn, part_length;

LAST_NAME            FIRST_NAME           PART                     LAST_NAME_COUNT PART_COUNT
-------------------- -------------------- ------------------------ --------------- ----------
Roberts              Chris                                                       2          2
Roberts              Chris                C                                      2          2
Roberts              Chris                Ch                                     2          2
Roberts              Chris                Chr                                    2          2
Roberts              Chris                Chri                                   2          2
Roberts              Chris                Chris                                  2          2
Roberts              Christian                                                   2          2
Roberts              Christian            C                                      2          2
Roberts              Christian            Ch                                     2          2
Roberts              Christian            Chr                                    2          2
Roberts              Christian            Chri                                   2          2
Roberts              Christian            Chris                                  2          2
Roberts              Christian            Christ                                 2          1
Roberts              Christian            Christi                                2          1
Roberts              Christian            Christia                               2          1
Roberts              Christian            Christian                              2          1
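The third CTE, v, looks only at the unique contractions (plus the full first name, as a fallback) and ranks them by length; so for each record, the shortest contraction of the first name that is unique across all records is ranked 1.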
...
v as (
  select last_name, first_name, orig_rn, part, last_name_count,
    row_number() over (partition by orig_rn order by part_length) as rn
  from u
  where (part_count = 1 or part = first_name)
)
select last_name, first_name, part, last_name_count
from v
where rn = 1
order by orig_rn;

LAST_NAME            FIRST_NAME           PART                     LAST_NAME_COUNT
-------------------- -------------------- ------------------------ ---------------
Johnson              Charles                                                     1
Roberts              Chris                Chris                                  2
Roberts              Christian            Christ                                 2
Smith                Albert               Alb                                    3
Smith                Alphonse             Alp                                    3
Smith                Jason                J                                      3
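The final query then picks out the rows ranked 1, which hold the shortest unique value, and formats them the way you wanted.
If two people have exactly the same name, both are spelled out in full (demo), which seems to be what you asked for in the comments.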
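To see the duplicate-name case for yourself, one option is to add a second row with the same name and re-run the full query. This is only a sketch: the extra row is hypothetical, and it assumes t42 has just the two columns used in the query.

```sql
-- Hypothetical extra row: a second person named Chris Roberts.
insert into t42 (last_name, first_name) values ('Roberts', 'Chris');

-- Re-run the full query from the top of this answer at this point.

-- Optional cleanup: remove one of the two identical rows again.
delete from t42
where last_name = 'Roberts'
  and first_name = 'Chris'
  and rownum = 1;
```

With that row in place, no contraction of either Chris has a part_count of 1, so only the part = first_name branch of the filter in v can match, and both rows come out as the full name.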
I'm not sure this really qualifies as 'clean', except that it only hits the original table once.
Answer 1 (score: 3)
Try this:
with
  last_names as (
    select last_name, count(*) as last_name_count
    from table_name
    group by last_name )
select case
         when b.last_name_count = 1 then a.last_name
         else substr(a.first_name,1,1)||'. '||a.last_name
       end as name
from table_name a
join last_names b
  on a.last_name = b.last_name;
Replace table_name with the correct name.
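Because this version always uses a single initial, two people who share a last name and a first initial get the same abbreviation. A quick way to check whether that matters for your data is to wrap the query and look for repeated results. The sketch below keeps the same table_name placeholder; the names CTE and the occurrences alias are introduced here only for illustration.

```sql
-- Sketch: report abbreviated names that are produced for more than one row.
with
  last_names as (
    select last_name, count(*) as last_name_count
    from table_name
    group by last_name ),
  names as (
    select case
             when b.last_name_count = 1 then a.last_name
             else substr(a.first_name,1,1)||'. '||a.last_name
           end as name
    from table_name a
    join last_names b
      on a.last_name = b.last_name )
select name, count(*) as occurrences
from names
group by name
having count(*) > 1
order by name;
```

With the sample rows from the question, this should list A. Smith and C. Roberts, each occurring twice.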