Question

Please help me: I need an SQL solution for grouping data in a SQL Server database. I'm pretty sure it can be done in a single SQL query, but I can't see the trick.

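Let's look at the problem: I have a two-column table (see the example below). I just want to add a new column containing a number or a string that identifies the group.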

Before:

Col1 | Col2
-----+-----
A    | B
B    | C
D    | E
F    | G
G    | H
I    | I
J    | U

After transformation:

Col1 | Col2 | Group
-----+------+------
A    | B    | 1
B    | C    | 1
D    | E    | 2
F    | G    | 3
G    | H    | 3
I    | I    | 4
J    | U    | 5

In other words: A, B, and C are in the same group; so are D and E; F, G, and H are in group 3; and so on.

Answer 1

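Do you have a lookup table that holds this group mapping? Or, if you just have some defined logic that determines the group, I would suggest adding a UDF that returns the group for the supplied values.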

SELECT Col1, Col2, dbo.GetGroupID(Col1, Col2) AS [Group]
FROM [Table]

Your UDF would look something like this:

CREATE FUNCTION dbo.GetGroupID
(
    -- Add the parameters for the function here
    @Col1 varchar(10),
    @Col2 varchar(10)
)
RETURNS int
AS
BEGIN
    DECLARE @groupID int

    -- String literals use single quotes in T-SQL
    IF (@Col1 = 'A' AND @Col2 = 'B') OR (@Col1 = 'B' AND @Col2 = 'C')
    BEGIN
        SET @groupID = 1
    END
    IF @Col1 = 'D' AND @Col2 = 'E'
    BEGIN
        SET @groupID = 2
    END
    -- You can write several conditions in the same manner.
    RETURN @groupID
END

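However, if you have this mapping defined somewhere in another table, let us know the table's structure so we can update the query to join with that table instead of using a UDF.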

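With query performance in mind: if the table holds a lot of data, it is advisable to put these mappings in a fixed table and join to it in the query, because a UDF can hurt performance on large data sets.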

Answer 2

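There is absolutely no need for a UDF here. Whether you want to update the table with the new column or just apply the grouping when extracting data, you are better off with a set-based solution, i.e. creating a table and joining to it.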

I am assuming you don't have messy data, such as a row with Col1 = 'A' and Col2 = 'F'.

If you are able to add a new table permanently, you can create the lookup table with the following:

create table Col1Groups(Col1 nvarchar(10), GroupNum int);
insert into Col1Groups(Col1,GroupNum) values ('A',1),('B',1),('C',1),('D',2),('E',2),('F',3),('G',3),('H',3);

Then join to it:

select t.Col1
      ,t.Col2
      ,g.GroupNum
from [Table] t
    inner join Col1Groups g
        on t.Col1 = g.Col1

If you can't, you can create a derived table via a CTE:

with Col1Groups as
(
    select Col1
          ,GroupNum
    from (values('A',1),('B',1),('C',1),('D',2),('E',2),('F',3),('G',3),('H',3)) as x(Col1,GroupNum)
)
select t.Col1
      ,t.Col2
      ,g.GroupNum
from [Table] t
    inner join Col1Groups g
        on t.Col1 = g.Col1
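
To check the lookup join against the question's sample data, here is a minimal sketch that runs it through Python's built-in sqlite3 module instead of SQL Server (an assumption made only so the example is easy to run). The table name `pairs` is hypothetical and stands in for the question's table.

```python
import sqlite3

# In-memory SQLite database, used here only as a stand-in for SQL Server.
conn = sqlite3.connect(":memory:")
conn.executescript("""
create table pairs(Col1 text, Col2 text);  -- hypothetical name for the question's table
insert into pairs values
    ('A','B'),('B','C'),('D','E'),('F','G'),('G','H'),('I','I'),('J','U');

create table Col1Groups(Col1 text, GroupNum int);
insert into Col1Groups(Col1,GroupNum) values
    ('A',1),('B',1),('C',1),('D',2),('E',2),('F',3),('G',3),('H',3);
""")

# Same inner join as above, ordered so the output is deterministic.
joined = conn.execute("""
select t.Col1, t.Col2, g.GroupNum
from pairs t
    inner join Col1Groups g
        on t.Col1 = g.Col1
order by t.Col1
""").fetchall()

for row in joined:
    print(row)
```

With this lookup data the rows I | I and J | U do not come back, because the inner join only keeps rows whose Col1 has an entry in Col1Groups. You would need to add those values to the lookup (or use a left join) to see them.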

Answer 3

Get the first row of each group with:

select col1, col2 from mytable where col1 not in (select col2 from mytable) or col1 = col2;

We can give these rows numbers with:

rank() over (order by col1) as grp

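Now we must iterate through the rows to find those that belong to these first rows, then those that belong to the rows just found, and so on. That calls for a recursive query: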

with cte(col1, col2, grp) as 
(
  select col1, col2, rank() over (order by col1) as grp
  from mytable where col1 not in (select col2 from mytable) or col1 = col2
  union all
  select mytable.col1, mytable.col2, cte.grp
  from cte
  join mytable on mytable.col1 = cte.col2
  where mytable.col1 <> mytable.col2
)
select * from cte
order by grp, col1;
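
As a quick sanity check of the recursive query, the sketch below runs it on the question's sample rows with Python's built-in sqlite3 module. This is SQLite rather than SQL Server, so two things are assumptions of the sketch: SQLite wants the `recursive` keyword after `with`, and the bundled SQLite must be recent enough (3.25 or later) to support window functions such as `rank()`.

```python
import sqlite3

# Sample rows from the question.
rows = [("A", "B"), ("B", "C"), ("D", "E"), ("F", "G"),
        ("G", "H"), ("I", "I"), ("J", "U")]

conn = sqlite3.connect(":memory:")
conn.execute("create table mytable (col1 text, col2 text)")
conn.executemany("insert into mytable values (?, ?)", rows)

# Anchor part: first row of every chain, numbered with rank().
# Recursive part: follow col2 -> col1 links and carry the group number along.
query = """
with recursive cte(col1, col2, grp) as
(
  select col1, col2, rank() over (order by col1) as grp
  from mytable where col1 not in (select col2 from mytable) or col1 = col2
  union all
  select mytable.col1, mytable.col2, cte.grp
  from cte
  join mytable on mytable.col1 = cte.col2
  where mytable.col1 <> mytable.col2
)
select * from cte
order by grp, col1
"""
result = conn.execute(query).fetchall()

for row in result:
    print(row)
```

On the sample data this reproduces the table from the question: groups 1, 1, 2, 3, 3, 4, 5 in row order.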

Answer 4

Additional answer for a more flexible approach

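If you want to go one step further and allow network-like relations such as A|B -> B|C, D|C, we can no longer simply follow the chain forward. (In this example D belongs to A's group: A does not lead directly to D, but A leads to C and D leads to C as well.) Here is a way to solve this: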

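Get all letters from the table (whether they are in col1 or col2). Then, for each one, find the related letters (again in either col1 or col2), then the letters related to those, and so on. This gives you the complete groups, but with duplicates (D is in A's group and A is also in D's group). You can get rid of the duplicates by simply taking the minimum (or maximum) group key per letter. Then join these groups to the table.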

Query (note that `rownum` and the `cycle` clause are Oracle syntax, not SQL Server):

with cte(col, grp) as 
(
  select col, rownum as grp from 
    (select col1 as col from mytable union select col2 from mytable)
  union all
  select case when mytable.col1 = cte.col then mytable.col2 else mytable.col1 end, cte.grp
  from cte
  join mytable on cte.col in (mytable.col1, mytable.col2) 
  where mytable.col1 <> mytable.col2
)
cycle col set is_cycle to 'y' default 'n'
select mytable.col1, mytable.col2, x.grp 
from mytable
join (select col, min(grp) as grp from cte group by col) x on x.col = mytable.col1
order by grp, col;
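
If it helps to see the idea outside SQL, the same "everything connected through any shared letter is one group" logic can be sketched in Python with a union-find structure. This is a different technique from the recursive query above, shown only to illustrate the grouping; the function name `assign_groups` is made up for this example, and groups are numbered in order of first appearance rather than by minimum key.

```python
def assign_groups(pairs):
    """Return one group number per (col1, col2) pair.

    Pairs that are linked through any shared value, directly or
    indirectly, receive the same number.
    """
    parent = {}

    def find(x):
        # Register unseen values as their own root, then walk up to the root.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Every row says "col1 and col2 belong together".
    for a, b in pairs:
        union(a, b)

    # Number the groups in the order their first row appears.
    group_of_root = {}
    groups = []
    for a, _ in pairs:
        root = find(a)
        if root not in group_of_root:
            group_of_root[root] = len(group_of_root) + 1
        groups.append(group_of_root[root])
    return groups


# Network-like example from this answer: D joins A's group through C.
network = [("A", "B"), ("B", "C"), ("D", "C"), ("F", "G"), ("I", "I")]
print(assign_groups(network))  # [1, 1, 1, 2, 3]
```

On the original sample from the question this gives 1, 1, 2, 3, 3, 4, 5, the same numbering as the expected output.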
