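T-SQL - Get a list of all As that have the same set of Bs

Time: 2015-06-06 07:40:57

Tags: sql tsql

I am struggling with a tricky SQL query that I am trying to write. Take a look at the following table: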

+---+---+
| A | B |
+---+---+
| 1 | 2 |
| 1 | 3 |
| 2 | 2 |
| 2 | 3 |
| 2 | 4 |
| 3 | 2 |
| 3 | 3 |
| 4 | 2 |
| 4 | 3 |
| 4 | 4 |
+---+---+

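Now, from this table, I basically want a list of all As that have exactly the same set of Bs, with an incrementing ID assigned to each distinct set.

So the output for the table above would be: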

+---+----+
| A | ID |
+---+----+
| 1 |  1 |
| 3 |  1 |
| 2 |  2 |
| 4 |  2 |
+---+----+

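Thanks.

Edit: In case it helps, I have a list of all the distinct values of B that can occur, stored in another table.

Edit: Thank you very much for all the innovative answers. I was able to learn a lot.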
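For anyone who wants to run the answers below, here is a minimal setup sketch for the sample data. The table name MyTable and the int column types are assumptions; the answers variously call the table t, MyTable, or sometable, so rename it to match whichever query you try.

```sql
-- Hypothetical table name and column types; adjust to match the query you run.
CREATE TABLE MyTable (A int NOT NULL, B int NOT NULL);

-- The ten rows from the table in the question.
INSERT INTO MyTable (A, B) VALUES
(1, 2), (1, 3),
(2, 2), (2, 3), (2, 4),
(3, 2), (3, 3),
(4, 2), (4, 3), (4, 4);
```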

6 answers:

Answer 0 (score: 5)

Here is a mathematical trick for solving this tricky select:

with pow as(select *, b * power(10, row_number() 
              over(partition by a order by b)) as rn from t)
select a, dense_rank() over( order by sum(rn)) as rn 
from pow
group by a
order by rn, a

Fiddle: http://sqlfiddle.com/#!3/6b98d/11
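To make the arithmetic concrete, the sketch below shows the per-A sums that the power-of-ten query ranks. It uses a table variable @t (an assumption, so the snippet is self-contained) loaded with the question's sample data. Each B is multiplied by 10 raised to its position within its A group, so {2, 3} becomes 2*10 + 3*100 and {2, 3, 4} becomes 2*10 + 3*100 + 4*1000. Note that the encoding is only unambiguous when every B is a single non-zero digit: for example, {10} and {0, 1} would both encode to 100.

```sql
-- Assumed table variable holding the question's sample data.
DECLARE @t TABLE (a int, b int);

INSERT INTO @t VALUES
(1, 2), (1, 3), (2, 2), (2, 3), (2, 4), (3, 2), (3, 3), (4, 2), (4, 3), (4, 4);

-- Show the encoded value for each A before any ranking is applied.
SELECT a, SUM(b * POWER(10, rn)) AS encoded
FROM (
    SELECT a, b,
           ROW_NUMBER() OVER (PARTITION BY a ORDER BY b) AS rn
    FROM @t
) AS x
GROUP BY a
ORDER BY a;
-- a = 1 and a = 3 both give 320; a = 2 and a = 4 both give 4320
```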

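The power-of-ten trick of course only works for a limited number of distinct values, because otherwise the sum overflows. Here is a more general solution that uses strings: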

select a, 
dense_rank() over(order by (select  '.' + cast(b as varchar(max))
                            from t t2 where t1.a = t2.a
                            order by b
                            for xml path(''))) rn
from t t1
group by a
order by rn, a

Fiddle: http://sqlfiddle.com/#!3/6b98d/29
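On newer servers the same string-based idea can be written without FOR XML. The following is only a sketch, not part of the original answer: it assumes SQL Server 2017 or later (where STRING_AGG is available) and the same table t(a, b). The inner DISTINCT is there so that repeated B values within one A do not change the resulting string.

```sql
-- Assumes SQL Server 2017+ and a table t(a, b); swaps FOR XML PATH for STRING_AGG.
WITH sets AS (
    SELECT a,
           -- Concatenate the B values in a fixed order so equal sets give equal strings.
           STRING_AGG(CAST(b AS varchar(max)), '.') WITHIN GROUP (ORDER BY b) AS b_list
    FROM (SELECT DISTINCT a, b FROM t) AS d
    GROUP BY a
)
SELECT a, DENSE_RANK() OVER (ORDER BY b_list) AS rn
FROM sets
ORDER BY rn, a;
```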

Answer 1 (score: 3)

Something like this:

select a, dense_rank() over (order by g) as id_b
from (
  select a,
  (select b from MyTable s where s.a=a.a order by b FOR XML PATH('')) g
  from MyTable a
  group by a
) a
order by id_b,a

Or possibly with a CTE (I avoid them whenever I can).

Sql Fiddle

As a side note, this is the output of the inner query for the sample data in the question:

a   g
1   <b>2</b><b>3</b>
2   <b>2</b><b>3</b><b>4</b>
3   <b>2</b><b>3</b>
4   <b>2</b><b>3</b><b>4</b>
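For completeness, the CTE form mentioned above might look like the sketch below. It is the same logic as the derived-table query, only rearranged; the CTE name sets is an arbitrary choice, and the table name MyTable is taken from the query above.

```sql
-- Same technique as the derived-table version, expressed as a CTE.
WITH sets AS (
    SELECT a.a,
           -- One XML string per A, e.g. <b>2</b><b>3</b>
           (SELECT s.b
            FROM MyTable s
            WHERE s.a = a.a
            ORDER BY s.b
            FOR XML PATH('')) AS g
    FROM MyTable a
    GROUP BY a.a
)
SELECT a, DENSE_RANK() OVER (ORDER BY g) AS id_b
FROM sets
ORDER BY id_b, a;
```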

Answer 2 (score: 2)

Edit: I have changed the code, and it is now longer. I took help from "Concatenate many rows into a single text string?" for merging the strings.

Select [A],
   Left(M.[C],Len(M.[C])-1) As [D] into #tempSomeTable
From
(
    Select distinct T2.[A], 
        (
            Select Cast(T1.[B] as VARCHAR) + ',' AS [text()]
            From sometable T1
            Where T1.[A] = T2.[A]
            ORDER BY T1.[B] -- order by B so that equal sets always produce equal strings
            For XML PATH ('')
        ) [C]
    From sometable T2
 )M

  SELECT t.A, DENSE_RANK() OVER(ORDER BY t.[D]) [ID] FROM
  #tempSomeTable t
   inner join
  (SELECT [D] FROM(
  SELECT [D], COUNT([A]) [D_A] from 
   #tempSomeTable t
  GROUP BY [D] )P where [D_A]>1)t1 on t1.[D]=t.[D]

Answer 3 (score: 2)

Here is a long-winded approach. It finds sets with the same elements (using EXCEPT in both directions to eliminate non-matches, over only a half-diagonal Cartesian product), then pairs up the equal sets, stamps each pair with a ROW_NUMBER(), and finally unpivots the pairs of A's into the final output, where equivalent sets are projected as rows with the same Id.

WITH joinedSets AS
(
    SELECT t1.A as t1A, t2.A AS t2A
    FROM MyTable t1
         INNER JOIN MyTable t2
         ON t1.B = t2.B AND t1.A < t2.A
),
equalSets AS
(
    SELECT js.t1A, js.t2A, ROW_NUMBER() OVER (ORDER BY js.t1A) AS Id
    FROM joinedSets js
    GROUP BY js.t1A, js.t2A
    HAVING NOT EXISTS ((SELECT mt.B FROM MyTable mt WHERE mt.A = js.t1A)
                       EXCEPT (SELECT mt.B FROM MyTable mt WHERE mt.A = js.t2A))
       AND NOT EXISTS ((SELECT mt.B FROM MyTable mt WHERE mt.A = js.t2A)
                       EXCEPT (SELECT mt.B FROM MyTable mt WHERE mt.A = js.t1A))
)
SELECT A, Id
FROM equalSets
UNPIVOT
(
    A
    FOR ACol in (t1A, t2A)
) unp;

SqlFiddle here

At present, this solution only works for pairs of sets, not triples and so on. A more general solution may be possible (but is beyond my brain right now).
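One possible way around the pairs-only restriction, offered as an untested sketch rather than as the answer's own method, is to give every A a representative: the smallest A whose set of Bs is identical, again checked with EXCEPT in both directions. Ranking the representatives then numbers groups of any size. The CTE names ids and rep are arbitrary. Unlike the pair-based query, this also returns As whose set is shared with no other A.

```sql
-- Sketch: map each A to the smallest A with an identical set of Bs, then rank.
WITH ids AS (
    SELECT DISTINCT A FROM MyTable
),
rep AS (
    SELECT x.A, MIN(y.A) AS RepA
    FROM ids x
         INNER JOIN ids y
         -- Both EXCEPT results are empty only when the two sets are equal.
         ON  NOT EXISTS ((SELECT m.B FROM MyTable m WHERE m.A = x.A)
                         EXCEPT (SELECT m.B FROM MyTable m WHERE m.A = y.A))
         AND NOT EXISTS ((SELECT m.B FROM MyTable m WHERE m.A = y.A)
                         EXCEPT (SELECT m.B FROM MyTable m WHERE m.A = x.A))
    GROUP BY x.A
)
SELECT A, DENSE_RANK() OVER (ORDER BY RepA) AS Id
FROM rep
ORDER BY Id, A;
```

Because every A is compared with every other A, this is likely to be slow on large tables.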

Answer 4 (score: 2)

Here is a very simple and fast, but approximate, solution. CHECKSUM_AGG may return the same checksum for different sets of B.

DECLARE @T TABLE (A int, B int);

INSERT INTO @T VALUES
(1, 2),(1, 3),(2, 2),(2, 3),(2, 4),(3, 2),(3, 3),(4, 2),(4, 3),(4, 4);

SELECT
    A
    ,CHECKSUM_AGG(B) AS CheckSumB
    ,ROW_NUMBER() OVER (PARTITION BY CHECKSUM_AGG(B) ORDER BY A) AS GroupNumber
FROM @T
GROUP BY A
ORDER BY A, GroupNumber;

Result set

A    CheckSumB    GroupNumber
-----------------------------
1    1            1
2    5            1
3    1            2
4    5            2

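For an exact solution, group by A and concatenate all the B values into one long (binary) string using FOR XML, CLR, or a T-SQL function. You can then partition ROW_NUMBER by that concatenated string to assign numbers to the groups, as shown in the other answers.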
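To see why the checksum alone is only approximate, the sketch below builds two different sets that are candidates for a collision. It rests on an assumption that is not confirmed here: that CHECKSUM_AGG over int values combines them like XOR, which is at least consistent with the result set above (2 XOR 3 = 1, 2 XOR 3 XOR 4 = 5). The CountB column is an addition for illustration; comparing it alongside the checksum separates these two sets, although it still does not make the method exact.

```sql
-- Assumption: CHECKSUM_AGG over int values behaves like XOR; verify on your server.
DECLARE @T TABLE (A int, B int);

INSERT INTO @T VALUES
(1, 1), (1, 2),   -- A = 1 has the set {1, 2}
(2, 3);           -- A = 2 has the set {3}

-- If the two checksums come out equal, the differing counts still tell the sets apart.
SELECT
    A
    ,CHECKSUM_AGG(B) AS CheckSumB
    ,COUNT(*) AS CountB
FROM @T
GROUP BY A
ORDER BY A;
```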

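Answer 5 (score: 0)

Here is a solution that is exact rather than approximate. It uses nothing more advanced than INNER JOIN and GROUP BY (plus, of course, DENSE_RANK() to get the ID you want).

It is also general, in that it allows B values to repeat within an A group.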

SELECT   A,
         DENSE_RANK() OVER (ORDER BY MIN_EQUIVALENT_A) AS ID

FROM     (
          SELECT   MATCHES.A1 AS A,
                   MIN(MATCHES.A2) AS MIN_EQUIVALENT_A

          FROM     (
                    SELECT   T1.A AS A1,
                             T2.A AS A2,
                             COUNT(*) AS NUM_B_VALS_MATCHED

                    FROM     (
                              SELECT   A,
                                       B,
                                       COUNT(*) AS B_VAL_FREQ
                              FROM     MyTable
                              GROUP BY A,
                                       B
                             ) AS T1

                             INNER JOIN

                             (
                              SELECT   A,
                                       B,
                                       COUNT(*) AS B_VAL_FREQ
                              FROM     MyTable
                              GROUP BY A,
                                       B
                             ) AS T2

                             ON T1.B = T2.B
                                AND T1.B_VAL_FREQ = T2.B_VAL_FREQ

                    GROUP BY T1.A,
                             T2.A
                   ) AS MATCHES

                   INNER JOIN

                   (
                    SELECT   A,
                             COUNT(DISTINCT B) AS NUM_B_VALS_TOTAL
                    FROM     MyTable
                    GROUP BY A
                   ) AS CHECK_TOTALS_A1

                   ON MATCHES.A1 = CHECK_TOTALS_A1.A
                      AND MATCHES.NUM_B_VALS_MATCHED
                          = CHECK_TOTALS_A1.NUM_B_VALS_TOTAL

                   INNER JOIN

                   (
                    SELECT   A,
                             COUNT(DISTINCT B) AS NUM_B_VALS_TOTAL
                    FROM     MyTable
                    GROUP BY A
                   ) AS CHECK_TOTALS_A2

                   ON MATCHES.A2 = CHECK_TOTALS_A2.A
                      AND MATCHES.NUM_B_VALS_MATCHED
                          = CHECK_TOTALS_A2.NUM_B_VALS_TOTAL

          GROUP BY MATCHES.A1
         ) AS EQUIVALENCE_TABLE

ORDER BY 2,1
;