Question

My problem is to display only the ids of grouped, unique sets of data. A simple example explains it best:

| id | color |
--------------
| 1  | red   |
--------------
| 1  | green |
--------------
| 1  | blue  |
--------------
| 2  | red   |
--------------
| 2  | green |
--------------
| 2  | blue  |
--------------
| 3  | red   |
--------------
| 3  | blue  |
--------------
| 3  | yellow|
--------------
| 3  | purple|
--------------

Ids 1 and 2 have the same set of colors (red, green, blue), so the result table should contain only one of them, either 1 or 2:

| id |
------
| 1  |
------
| 3  | 
------

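I suspect this relatively basic problem has been asked many times, but I could not work out the specific keywords that would turn up results.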

Answer 1

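Although SQLite has group_concat(), it does not help here because the order of the concatenated elements is arbitrary. Otherwise that would have been the easiest way to do this.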
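To make the ordering issue concrete, here is a minimal Python sketch (not part of the original answer) that builds an order-independent signature for each id outside of SQL, using the standard sqlite3 module. The table name t matches the queries in this thread; the in-memory database and the helper name smallest_id_per_color_set are assumptions made for illustration. A concatenated value only works as a grouping key when its elements are in a canonical order, which is what the sort provides here.

```python
import sqlite3

# Sample rows from the question: (id, color).
ROWS = [
    (1, "red"), (1, "green"), (1, "blue"),
    (2, "red"), (2, "green"), (2, "blue"),
    (3, "red"), (3, "blue"), (3, "yellow"), (3, "purple"),
]


def smallest_id_per_color_set(conn):
    """Return the smallest id for each distinct set of colors (hypothetical helper)."""
    colors_by_id = {}
    for id_, color in conn.execute("SELECT id, color FROM t"):
        colors_by_id.setdefault(id_, set()).add(color)

    smallest = {}
    for id_, colors in colors_by_id.items():
        # Sorting gives a canonical order, so equal sets yield equal signatures
        # no matter in which order the rows were returned.
        signature = tuple(sorted(colors))
        smallest[signature] = min(id_, smallest.get(signature, id_))
    return sorted(smallest.values())


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, color TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?)", ROWS)

result = smallest_id_per_color_set(conn)
print(result)
```

This moves the grouping into application code, so it is a cross-check rather than a pure SQL solution.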

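Instead, we have to think about this relationally. The idea is to do the following:

Count the number of colors that two ids have in common
Count the number of colors on each of the two ids
Select the pairs of ids where these three values are equal
Identify each pair by the minimum id in the pair

The distinct values of that minimum are then the list you want.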
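As a rough illustration of these steps, the sketch below computes each intermediate result separately with Python's sqlite3 module, using the sample data from the question. The in-memory database and the variable names are assumptions for illustration, and the counting only works if no (id, color) row appears twice.

```python
import sqlite3

# Sample rows from the question: (id, color).
ROWS = [
    (1, "red"), (1, "green"), (1, "blue"),
    (2, "red"), (2, "green"), (2, "blue"),
    (3, "red"), (3, "blue"), (3, "yellow"), (3, "purple"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, color TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?)", ROWS)

# Step 1: number of colors shared by each ordered pair of ids (self-join on color).
shared = {
    (id1, id2): n
    for id1, id2, n in conn.execute(
        "SELECT t1.id, t2.id, COUNT(*) "
        "FROM t t1 JOIN t t2 ON t1.color = t2.color "
        "GROUP BY t1.id, t2.id"
    )
}

# Step 2: number of colors on each id.
per_id = dict(conn.execute("SELECT id, COUNT(*) FROM t GROUP BY id"))

# Step 3: keep the pairs where the shared count equals both per-id counts.
matching = sorted(
    pair for pair, n in shared.items()
    if n == per_id[pair[0]] == per_id[pair[1]]
)

# Step 4: label each id by the smallest id it matches, then de-duplicate.
smallest = {}
for id1, id2 in matching:
    smallest[id1] = min(id2, smallest.get(id1, id2))
unique_ids = sorted(set(smallest.values()))

print(matching)
print(unique_ids)
```

Every id matches itself, so an id whose color set is unique still shows up, labeled by its own value.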

The following query takes this approach:

select distinct MIN(id2)
from (select t1.id as id1, t2.id as id2, count(*) as cnt
      from t t1 join
           t t2
           on t1.color = t2.color
      group by t1.id, t2.id
     ) t1t2 join
     (select t.id, COUNT(*) as cnt
      from t
      group by t.id
     ) t1sum
     on t1t2.id1 = t1sum.id and t1sum.cnt = t1t2.cnt join
     (select t.id, COUNT(*) as cnt
      from t
      group by t.id
     ) t2sum
     on t1t2.id2 = t2sum.id and t2sum.cnt = t1t2.cnt
group by t1t2.id1, t1t2.cnt, t1sum.cnt, t2sum.cnt

I actually tested this in SQL Server by putting the following WITH clause in front of it:

with t as (
      select 1 as id, 'r' as color union all
      select 1, 'g' union all
      select 1, 'b' union all
      select 2 as id, 'r' as color union all
      select 2, 'g' union all
      select 2, 'b' union all
      select 3, 'r' union all
      select 4, 'y' union all
      select 4, 'p' union all
      select 5 as id, 'r' as color union all
      select 5, 'g' union all
      select 5, 'b' union all
      select 5, 'p'
     )
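The answer reports a test in SQL Server only. As a hedged sketch of how the same combination could be tried in SQLite, which also supports WITH clauses, the snippet below prepends the test data to the query and runs it through Python's sqlite3 module. The in-memory connection and the variable names are assumptions for illustration; the SQL is copied from the answer.

```python
import sqlite3

# Test data from the answer, expressed as a common table expression named t.
TEST_DATA = """
with t as (
      select 1 as id, 'r' as color union all
      select 1, 'g' union all
      select 1, 'b' union all
      select 2 as id, 'r' as color union all
      select 2, 'g' union all
      select 2, 'b' union all
      select 3, 'r' union all
      select 4, 'y' union all
      select 4, 'p' union all
      select 5 as id, 'r' as color union all
      select 5, 'g' union all
      select 5, 'b' union all
      select 5, 'p'
     )
"""

# The query from the answer, unchanged.
QUERY = """
select distinct MIN(id2)
from (select t1.id as id1, t2.id as id2, count(*) as cnt
      from t t1 join
           t t2
           on t1.color = t2.color
      group by t1.id, t2.id
     ) t1t2 join
     (select t.id, COUNT(*) as cnt
      from t
      group by t.id
     ) t1sum
     on t1t2.id1 = t1sum.id and t1sum.cnt = t1t2.cnt join
     (select t.id, COUNT(*) as cnt
      from t
      group by t.id
     ) t2sum
     on t1t2.id2 = t2sum.id and t2sum.cnt = t1t2.cnt
group by t1t2.id1, t1t2.cnt, t1sum.cnt, t2sum.cnt
"""

conn = sqlite3.connect(":memory:")
# The query has no ORDER BY, so sort in Python to get a stable result.
ids = sorted(row[0] for row in conn.execute(TEST_DATA + QUERY))
print(ids)
```

In this data, ids 1 and 2 share the set (r, g, b), while id 5 has the extra color p and therefore counts as its own group.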

Answer 2

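SQL is set-oriented, so let's try this:

The unique ids are those for which no other id with the same set of colors exists. (The query below only compares each id against smaller ids, so the smallest id of every group is kept.)

To determine whether two ids have the same set of colors, we subtract one set from the other (which is what EXCEPT does) and test whether the result is empty in both directions: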

SELECT id
FROM (SELECT DISTINCT id FROM t) AS t1
WHERE NOT EXISTS (SELECT id FROM (SELECT DISTINCT id FROM t) AS t2
                  WHERE t2.id < t1.id
                    AND NOT EXISTS (SELECT color FROM t WHERE id = t1.id
                                    EXCEPT
                                    SELECT color FROM t WHERE id = t2.id)
                    AND NOT EXISTS (SELECT color FROM t WHERE id = t2.id
                                    EXCEPT
                                    SELECT color FROM t WHERE id = t1.id));
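To try the query above against the sample data from the question, a minimal harness using Python's sqlite3 module could look like this. The in-memory database and the variable names are assumptions for illustration; the SQL is the answer's query without changes.

```python
import sqlite3

# Sample rows from the question: (id, color).
ROWS = [
    (1, "red"), (1, "green"), (1, "blue"),
    (2, "red"), (2, "green"), (2, "blue"),
    (3, "red"), (3, "blue"), (3, "yellow"), (3, "purple"),
]

# An id is kept when no smaller id has exactly the same colors, i.e. when
# the set difference is non-empty in at least one direction for every smaller id.
QUERY = """
SELECT id
FROM (SELECT DISTINCT id FROM t) AS t1
WHERE NOT EXISTS (SELECT id FROM (SELECT DISTINCT id FROM t) AS t2
                  WHERE t2.id < t1.id
                    AND NOT EXISTS (SELECT color FROM t WHERE id = t1.id
                                    EXCEPT
                                    SELECT color FROM t WHERE id = t2.id)
                    AND NOT EXISTS (SELECT color FROM t WHERE id = t2.id
                                    EXCEPT
                                    SELECT color FROM t WHERE id = t1.id));
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, color TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?)", ROWS)

# The query has no ORDER BY, so sort in Python to get a stable result.
ids = sorted(row[0] for row in conn.execute(QUERY))
print(ids)
```

Id 2 is dropped because both differences against id 1 are empty, while id 3 stays because it has colors (yellow, purple) that neither smaller id has.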


Select id on grouped unique set of data
