Question

I have some sample data:

Johnson; Michael, Surendir;Mishra, Mohan; Ram
Johnson; Michael R.
Mohan; Anaha
Jordan; Michael
Maru; Tushar

The output of the query should be:

Johnson; Michael   2
Mohan; Anaha       1
Jordan; Michael    1
Maru; Tushar       1
Surendir;Mishra    1
Mohan; Ram         1

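As you can see, it prints a count for each name, but there is a twist. We can't simply group by the full name, because the first name sometimes includes a middle name and sometimes doesn't. For example, Johnson; Michael and Johnson; Michael R. are treated as a single name, so its count is 2. Also, either Johnson; Michael or Johnson; Michael R. should appear in the result set with a count of 2 (not both, because that would be a duplicate record).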

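The table contains names separated by commas, and it is not possible to normalize it because it is LIVE and is supplied to us by someone else.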

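Is there any way to write a query for this without using a cursor? I have about 3 million records in my database, and I have to support pagination etc. as well. What do you think is the best way to achieve this?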

Answer 1

This is why your data should be normalized.

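;with cte as
(
    -- Anchor: find the first comma in each row. A trailing comma is appended
    -- so that the last name in the row is terminated as well.
    select 1 as Item, 1 as Start, CHARINDEX(',', People + ',', 1) as Split,
           People + ',' as People
    from YourHorribleTable
    union all
    -- Recursive step: start after the previous comma and find the next one.
    -- Split becomes NULL after the last comma, which ends the recursion.
    select cte.Item + 1, cte.Split + 1, nullif(CHARINDEX(',', People, cte.Split + 1), 0), People
    from cte
    where cte.Split <> 0
)
select Person, COUNT(*) as NameCount
from
(
    -- Keep "Surname; FirstName" and drop whatever follows the first name
    -- (for example a middle initial), so both spellings fall into one group.
    select case when nullif(charindex(' ', person, 2 + nullif(CHARINDEX(';', person), 0)), 0) is null then person
        else substring(person, 1, charindex(' ', person, 2 + nullif(CHARINDEX(';', person), 0)) - 1)
        end as Person
    from
    (
        -- Cut each name out of the delimited string and trim it.
        select LTRIM(RTRIM(SUBSTRING(People, Start, isnull(Split, len(People) + 1) - Start))) as person
        from cte
    ) v
    where person <> ''
) v
group by Person
order by COUNT(*) desc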
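
If it helps to see the grouping rule outside the database, here is a minimal Python sketch of what the query above is meant to do: split each row on commas, trim each piece, and cut the name at the first space that follows the first name. The function names normalize and count_people are only illustrative, and this is a way to check the logic on sample rows, not a replacement for the query.

```python
from collections import Counter


def normalize(person: str) -> str:
    """Reduce 'Surname; First Middle' to 'Surname; First' (hypothetical helper)."""
    person = person.strip()
    sep = person.find(";")
    if sep == -1:
        # No ';' in the name: leave it unchanged, as the query does.
        return person
    # Skip the ';' and the character right after it (usually a space),
    # then look for the space that ends the first name.
    cut = person.find(" ", sep + 2)
    return person if cut == -1 else person[:cut]


def count_people(rows):
    """Count normalized names across comma-delimited rows."""
    counts = Counter()
    for row in rows:
        for piece in row.split(","):
            name = normalize(piece)
            if name:  # ignore empty pieces, e.g. from a trailing comma
                counts[name] += 1
    return counts


if __name__ == "__main__":
    rows = [
        "Johnson; Michael, Surendir;Mishra, Mohan; Ram",
        "Johnson; Michael R.",
        "Mohan; Anaha",
        "Jordan; Michael",
        "Maru; Tushar",
    ]
    for name, total in count_people(rows).most_common():
        print(name, total)
```

With the sample rows from the question, Johnson; Michael and Johnson; Michael R. land in the same group, which is the behaviour the case expression in the query aims for.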
