Question

I have two tables:

Person:

PersonId   Name
--------------------
1          Peter
2          Steven
3          Luck

Hobbies:

PersonId   Hobbie
--------------------
1          Running
1          Cooking
2          Running
3          Running
3          Cooking

I need to select only the people who have exactly the same hobbies and the same number of hobbies.

Example result:

PersonId
--------------
1
3

Answer 1

with hobbies (PersonId, Hobbie) as (
    select 1, 'Running' from dual union all
    select 1, 'Cooking' from dual union all
    select 2, 'Running' from dual union all
    select 3, 'Running' from dual union all
    select 3, 'Cooking' from dual
),
src as (
    -- one row per person and hobby, plus that person's total number of hobbies
    select PersonId, Hobbie, count(Hobbie) over (partition by PersonId) as cnt
    from hobbies
    group by PersonId, Hobbie
)
select distinct s1.PersonId
from src s1
join src s2
  on s1.Hobbie = s2.Hobbie
 and s1.cnt = s2.cnt
 and s1.PersonId <> s2.PersonId
-- keep a pair only when every hobby matches, not just one of them
group by s1.PersonId, s2.PersonId, s1.cnt
having count(*) = s1.cnt

You can try this approach: count each person's hobbies first, then join on the count and the hobby.

Answer 2

Assuming you are using MySQL, you can achieve this as follows.

First, aggregate all the rows for each person, counting the hobbies and concatenating them:

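select  p.PersonId,
        count(h.Hobbie) cnt,
        GROUP_CONCAT(h.Hobbie ORDER BY h.Hobbie SEPARATOR ',') hobbies
from    person p
join    hobbies h
on      p.PersonId = h.PersonId
group by p.PersonId;

Note that, to make sure equal sets of hobbies produce equal strings, the hobbies must be concatenated in a fixed order; otherwise a different row order could yield a different concatenated value. The ORDER BY inside GROUP_CONCAT guarantees that order, whereas an ORDER BY in a derived table is not guaranteed to survive the join.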

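Then you can self-join the result of this query, using the hobby count and the concatenated string as the join conditions, and excluding rows that have the same PersonId.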

select  t1.PersonId, t2.PersonId
from    (
        select  p.PersonId,
                count(h.Hobbie) cnt,
                GROUP_CONCAT(h.Hobbie ORDER BY h.Hobbie SEPARATOR ',') hobbies
        from    person p
        join    hobbies h
        on      p.PersonId = h.PersonId
        group by p.PersonId
        ) t1
join    (
        select  p.PersonId,
                count(h.Hobbie) cnt,
                GROUP_CONCAT(h.Hobbie ORDER BY h.Hobbie SEPARATOR ',') hobbies
        from    person p
        join    hobbies h
        on      p.PersonId = h.PersonId
        group by p.PersonId
        ) t2
on      t1.cnt = t2.cnt and
        t1.hobbies = t2.hobbies and
        t1.PersonId <> t2.PersonId

You can see it in action here
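To make the ordering point concrete, here is a minimal Python sketch of the same "count and concatenate" idea outside the database, using the sample rows from the question. The function name `people_with_shared_hobby_sets` and the variable names are made up for illustration. Sorting before joining plays the role of the fixed concatenation order, and the comma separator assumes that no hobby name contains a comma.

```python
from collections import defaultdict

# Sample rows from the question's Hobbies table: (PersonId, Hobbie).
# Person 3's rows are deliberately listed in a different order than person 1's.
rows = [
    (1, "Running"), (1, "Cooking"),
    (2, "Running"),
    (3, "Cooking"), (3, "Running"),
]


def people_with_shared_hobby_sets(rows):
    """Return the sorted ids of people whose full hobby set matches someone else's."""
    # Collect each person's hobbies.
    hobbies_by_person = defaultdict(set)
    for person_id, hobby in rows:
        hobbies_by_person[person_id].add(hobby)

    # Build one signature string per person; sorting fixes the order,
    # so equal sets always give equal strings.
    people_by_signature = defaultdict(list)
    for person_id, hobbies in hobbies_by_person.items():
        signature = ",".join(sorted(hobbies))
        people_by_signature[signature].append(person_id)

    # Keep only signatures shared by at least two people.
    return sorted(
        person_id
        for group in people_by_signature.values()
        if len(group) > 1
        for person_id in group
    )


matching = people_with_shared_hobby_sets(rows)
print(matching)
```

Because the signature encodes both the hobbies and how many there are, comparing signatures alone is enough here; the separate count comparison in the SQL is a cheap extra filter.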

Answer 3

Here is one way, using a HAVING clause with correlated subqueries:

SELECT h1.PersonId AS p1, h2.PersonId AS p2
FROM Hobbies AS h1
INNER JOIN Hobbies AS h2
   ON h1.Hobbie = h2.Hobbie AND h1.PersonId <> h2.PersonId
GROUP BY h1.PersonId, h2.PersonId
-- the number of shared hobbies must equal each person's own hobby count
HAVING COUNT(*) = (SELECT COUNT(*)
                   FROM Hobbies
                   WHERE PersonId = h1.PersonId)
       AND
       COUNT(*) = (SELECT COUNT(*)
                   FROM Hobbies
                   WHERE PersonId = h2.PersonId)

The query picks every pair of people whose number of shared hobbies equals the number of hobbies that each person in the pair has.
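As a quick way to try this logic without a MySQL server, the sketch below runs the same kind of query against an in-memory SQLite database from Python. It is an illustration under assumptions, not the original demo: SQLite stands in for the real DBMS, the grouping is written on the underlying columns, an ORDER BY is added for stable output, and person 4 is a hypothetical extra row added to show that a partial overlap is rejected.

```python
import sqlite3

# In-memory SQLite database holding the question's Hobbies rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Hobbies (PersonId INTEGER, Hobbie TEXT)")
conn.executemany(
    "INSERT INTO Hobbies (PersonId, Hobbie) VALUES (?, ?)",
    [
        (1, "Running"), (1, "Cooking"),
        (2, "Running"),
        (3, "Running"), (3, "Cooking"),
        # Hypothetical person, not in the question: shares only one of
        # two hobbies with persons 1 and 3, so no pair should include 4.
        (4, "Running"), (4, "Swimming"),
    ],
)

PAIR_QUERY = """
SELECT h1.PersonId AS p1, h2.PersonId AS p2
FROM Hobbies AS h1
INNER JOIN Hobbies AS h2
   ON h1.Hobbie = h2.Hobbie AND h1.PersonId <> h2.PersonId
GROUP BY h1.PersonId, h2.PersonId
HAVING COUNT(*) = (SELECT COUNT(*) FROM Hobbies WHERE PersonId = h1.PersonId)
   AND COUNT(*) = (SELECT COUNT(*) FROM Hobbies WHERE PersonId = h2.PersonId)
ORDER BY p1, p2
"""

# Each matching pair appears twice, once in each direction.
pairs = conn.execute(PAIR_QUERY).fetchall()
print(pairs)
```

Note that the comparison relies on each (PersonId, Hobbie) combination appearing at most once in the table; duplicate rows would inflate the counts.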

Demo here

Answer 4

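Assuming you are using a DBMS that supports common table expressions (CTEs), such as SQL Server (MySQL supports them only from version 8.0 onward), you can do it like this: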

with hobbie_count as (
  -- number of hobbies per person
  select PersonId as pid, count(*) as c
  from Hobbies
  group by PersonId
),
cohob_count as (
  -- number of hobbies shared by each ordered pair of distinct people
  select h1.PersonId as pid1, h2.PersonId as pid2, count(*) as c
  from Hobbies h1
    join Hobbies h2
      on h1.Hobbie = h2.Hobbie
     and h1.PersonId <> h2.PersonId
  group by h1.PersonId, h2.PersonId
)
select cc.pid1, cc.pid2
from hobbie_count hc1
  join cohob_count cc
    on hc1.pid = cc.pid1 and hc1.c = cc.c
  join hobbie_count hc2
    on cc.pid2 = hc2.pid and cc.c = hc2.c
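The hobbie_count CTE counts how many hobbies each person has, and the cohob_count CTE counts how many hobbies each pair of people has in common. The final query joins them (the first one twice) to select the pairs whose number of shared hobbies equals each person's total number of hobbies.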

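If you happen to be using a DBMS that does not support CTEs (for example MySQL before 8.0), you can still use this general approach. In that case, convert the CTEs into ordinary views or into inline views.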
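The conversion to inline views is mechanical: each CTE body becomes a subquery in the FROM clause, repeated wherever the CTE was referenced. The sketch below runs a CTE form and an inline-view form side by side on the question's sample data and checks that they agree. It assumes SQLite (through Python's `sqlite3` module) as a stand-in DBMS and a table named `Hobbies`; the added ORDER BY is only there to make the output stable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Hobbies (PersonId INTEGER, Hobbie TEXT)")
conn.executemany(
    "INSERT INTO Hobbies (PersonId, Hobbie) VALUES (?, ?)",
    [(1, "Running"), (1, "Cooking"), (2, "Running"),
     (3, "Running"), (3, "Cooking")],
)

# CTE form: hobbie_count and cohob_count are defined once and reused.
CTE_QUERY = """
WITH hobbie_count AS (
  SELECT PersonId AS pid, COUNT(*) AS c
  FROM Hobbies
  GROUP BY PersonId
),
cohob_count AS (
  SELECT h1.PersonId AS pid1, h2.PersonId AS pid2, COUNT(*) AS c
  FROM Hobbies h1
  JOIN Hobbies h2
    ON h1.Hobbie = h2.Hobbie AND h1.PersonId <> h2.PersonId
  GROUP BY h1.PersonId, h2.PersonId
)
SELECT cc.pid1, cc.pid2
FROM hobbie_count hc1
JOIN cohob_count cc ON hc1.pid = cc.pid1 AND hc1.c = cc.c
JOIN hobbie_count hc2 ON cc.pid2 = hc2.pid AND cc.c = hc2.c
ORDER BY cc.pid1, cc.pid2
"""

# Inline-view form: the same subqueries written directly in the FROM clause.
INLINE_QUERY = """
SELECT cc.pid1, cc.pid2
FROM (SELECT PersonId AS pid, COUNT(*) AS c
      FROM Hobbies
      GROUP BY PersonId) AS hc1
JOIN (SELECT h1.PersonId AS pid1, h2.PersonId AS pid2, COUNT(*) AS c
      FROM Hobbies h1
      JOIN Hobbies h2
        ON h1.Hobbie = h2.Hobbie AND h1.PersonId <> h2.PersonId
      GROUP BY h1.PersonId, h2.PersonId) AS cc
  ON hc1.pid = cc.pid1 AND hc1.c = cc.c
JOIN (SELECT PersonId AS pid, COUNT(*) AS c
      FROM Hobbies
      GROUP BY PersonId) AS hc2
  ON cc.pid2 = hc2.pid AND cc.c = hc2.c
ORDER BY cc.pid1, cc.pid2
"""

cte_pairs = conn.execute(CTE_QUERY).fetchall()
inline_pairs = conn.execute(INLINE_QUERY).fetchall()
print(cte_pairs, inline_pairs)
```

The inline form repeats the per-person count subquery, which is the main readability cost of giving up the CTE; a regular view would avoid the repetition.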

How do I select records that have the same foreign list (two tables)?
