Question

我有一个简单的表格，如下所示：

ClientID    ItemID
1           1
1           2
1           3
2           1
2           2
3           3
4           3
5           1
5           2
5           4
5           5

其中两列组合成为主键。我现在的任务是识别分配给ClientID的所有唯一ItemID集。所以在我的例子中，集合将是：

ItemIDs 1,2,3 (used by ClientID 1)
ItemIDs 1,2 (used by ClientID 2)
ItemIDs 3 (used by ClientIDs 3 and 4)
ItemIDs 1,2,4,5 (used by ClientID 5)

理想情况下，输出将是两个表：

SetID    ItemID
1        1
1        2
1        3
2        1
2        2
3        3
4        1
4        2
4        4
4        5

ClientID    SetID
1           1
2           2
3           3
4           3
5           4

其中SetID将是一个在其他地方使用的新字段。

目前我识别唯一集合的方法包括使用游标为每个ClientID构建一个有序ItemID的字符串，然后比较输出以获取唯一字符串，最后解析回来。它写得很快但感觉很糟糕。

我确信必须有比这更好的方法。有什么想法吗？

Answer 1

-- Table to hold test data
declare @T table
(
  ClientID int,
  ItemID int
)

insert into @T values
(1, 1),(1, 2),(1, 3),
(2, 1),(2, 2),
(3, 3),(4, 3),
(5, 1),(5, 2),(5, 4),(5, 5)


-- Temp table that will hold the generated set's
declare @Tmp table
(
  ClientID int,
  ItemIDSet varchar(max),
  SetID int
)

-- Query the sets using rank() over a comma separated ItemIDSet
insert into @Tmp
select ClientID,
       ItemIDSet,
       rank() over(order by ItemIDSet) as SetID
from (
      select T1.ClientID,
             stuff((select ','+cast(T2.ItemID as varchar(10))
                    from @T as T2
                    where T1.ClientID = T2.ClientID
                    order by T2.ItemID
                    for xml path('')), 1, 1, '') as ItemIDSet
      from @T as T1
      group by T1.ClientID
     ) as T

-- Get ClientID and SetID from @Tmp
select ClientID, 
       SetID
from @Tmp
order by ClientID

-- Get SetID and ItemID from @Tmp
select SetID,
       T3.N.value('.', 'int') as ItemID
from ( 
       select distinct
              SetID,
              '<i>'+replace(ItemIDSet, ',', '</i><i>')+'</i>' as ItemIDSet
       from @Tmp
     ) as T1
  cross apply 
     ( 
       select cast(T1.ItemIDSet as xml) as ItemIDSet
     ) as T2
  cross apply T2.ItemIDSet.nodes('i') as T3(N)

结果：

ClientID    SetID
----------- -----------
1           2
2           1
3           4
4           4
5           3

SetID       ItemID
----------- -----------
1           1
1           2
2           1
2           2
2           3
3           1
3           2
3           4
3           5
4           3

SetID的值与您提供的输出中的值并不完全相同，但我认为这不是一个大问题。 SetID是由ItemIDSet排序的等级函数rank() over(order by ItemIDSet)生成的。

把它换成spin。

识别SQL Server中的链接ID集

1 个答案: