Question

I have a table with millions of rows in SQL Server 2008. I am looking for an alternative to using DISTINCT. Please see the query below:

create table #temp (id int)
create table #temp2 (id int, name varchar(55), t_id int)

insert into #temp values (1)
insert into #temp2 values (1,'john',1)
insert into #temp2 values (2,'alex',1)
insert into #temp2 values (3,'alex',1)

select t.id, t2.name
from #temp t
inner join #temp2 t2 on t.id = t2.t_id

This query returns the following output:

Id  Name
1   john
1   alex
1   alex

The expected output is:

Id Name
1  john
1  alex

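I know I can get the expected output with the DISTINCT keyword, but it hurts performance. Can you suggest some professional alternatives (other than GROUP BY) for handling this? Thanks!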

Edit: I have a custom concatenate function that lets me do this:

select t.id, concetenate(t2.name)
from #temp t
inner join #temp2 t2 on t.id = t2.t_id

This returns 1 john,alex,alex. I am looking for a way to get rid of one of the alex values without changing the function, and I don't want to use the "distinct" keyword.

Answer 1

Use GROUP BY:

select t.id, t2.name
from #temp t
inner join #temp2 t2 on t.id = t2.t_id
GROUP BY t.id, t2.name
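
Whether GROUP BY is actually cheaper than DISTINCT here is worth measuring rather than assuming, since the optimizer often produces the same plan for both forms. A minimal sketch for comparing them, assuming the #temp and #temp2 tables from the question already exist in the current session:

```sql
-- Report I/O and timing for each statement in the Messages output
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Variant 1: DISTINCT
select distinct t.id, t2.name
from #temp t
inner join #temp2 t2 on t.id = t2.t_id;

-- Variant 2: GROUP BY on the same columns
select t.id, t2.name
from #temp t
inner join #temp2 t2 on t.id = t2.t_id
group by t.id, t2.name;

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

Comparing the logical reads and elapsed times of the two statements on the real table (not the three-row sample) shows whether the rewrite changes anything.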

Or use a CTE with ROW_NUMBER together with your custom "concetenate" function:

;WITH cte
AS(
    -- Partition by t.id as well, so each name is deduplicated per id rather than across all ids
    select t.id, t2.name, RN=ROW_NUMBER()OVER(PARTITION BY t.id, t2.name ORDER BY t2.id)
    from #temp t
    inner join #temp2 t2 on t.id = t2.t_id
)
SELECT C.id
     , Name =concetenate(C.name)
FROM cte C WHERE C.RN = 1

Answer 2

You can use GROUP BY as shown below. But why are duplicates being inserted into your table in the first place?

create table #temp (id int)
create table #temp2 (id int, name varchar(55), t_id int)

insert into #temp values (1)
insert into #temp2 values (1,'john',1)
insert into #temp2 values (2,'alex',1)
insert into #temp2 values (3,'alex',1)

select t.id, t2.name
from #temp t
inner join #temp2 t2 on t.id = t2.t_id
GROUP BY t.id, t2.name

Another solution is to create a constraint that prevents duplicate values from being inserted.
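
The constraint idea could look roughly like the following. This is only a sketch: the permanent table dbo.PersonNames and the constraint name UQ_PersonNames_t_id_name are hypothetical stand-ins for #temp2, and it assumes a (t_id, name) pair should never occur twice.

```sql
-- Hypothetical permanent table mirroring #temp2 from the question
CREATE TABLE dbo.PersonNames (
    id   int         NOT NULL PRIMARY KEY,
    name varchar(55) NOT NULL,
    t_id int         NOT NULL,
    -- Assumption: a given name may appear only once per t_id
    CONSTRAINT UQ_PersonNames_t_id_name UNIQUE (t_id, name)
);

INSERT INTO dbo.PersonNames VALUES (1, 'john', 1);
INSERT INTO dbo.PersonNames VALUES (2, 'alex', 1);

-- This insert is rejected with a duplicate key error,
-- because (t_id = 1, name = 'alex') already exists
INSERT INTO dbo.PersonNames VALUES (3, 'alex', 1);
```

With the duplicates blocked at insert time, the plain join from the question returns each name once and needs neither DISTINCT nor GROUP BY. Existing duplicate rows would have to be cleaned up before such a constraint could be added to a populated table.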

Alternative to using DISTINCT in SQL Server 2008
