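Find rows that have the same value in one column and other values in another column?

Time: 2015-01-04 01:41:24

Tags: sql postgresql relational-division

I have a PostgreSQL database that stores users in a users table and the conversations they take part in in a conversation table. Since each user can take part in multiple conversations and each conversation can involve multiple users, I have a conversation_user association table to track which users participate in each conversation: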

# conversation_user
id  |  conversation_id | user_id
----+------------------+--------
1   |                1 |      32
2   |                1 |       3
3   |                2 |      32
4   |                2 |       3
5   |                2 |       4

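In the table above, user 32 has one conversation with just user 3 and another with both user 3 and user 4. How would I write a query that shows there is a conversation between just user 32 and user 3?

I have tried the following: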

SELECT conversation_id AS cid,
       user_id
FROM conversation_user
GROUP BY cid HAVING count(*) = 2
AND (user_id = 32
     OR user_id = 3);

SELECT conversation_id AS cid,
   user_id
FROM conversation_user
GROUP BY (cid HAVING count(*) = 2
AND (user_id = 32
     OR user_id = 3));

SELECT conversation_id AS cid,
       user_id
FROM conversation_user
WHERE (user_id = 32)
  OR (user_id = 3)
GROUP BY cid HAVING count(*) = 2;

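These queries throw an error saying that user_id must appear in the GROUP BY clause or be used in an aggregate function. Putting it in an aggregate function (e.g. MIN or MAX) doesn't sound appropriate. I thought my first two attempts were putting it in the GROUP BY clause.

What am I doing wrong?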

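4 answers:

Answer 0: (score: 4)

This is a case of relational division. We have assembled an arsenal of techniques under a related question.

The special difficulty is to exclude additional users. There are basically 4 techniques.

I suggest LEFT JOIN / IS NULL: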

SELECT cu1.conversation_id
FROM        conversation_user cu1
JOIN        conversation_user cu2 USING (conversation_id)
LEFT   JOIN conversation_user cu3 ON cu3.conversation_id = cu1.conversation_id
                                 AND cu3.user_id NOT IN (3,32)
WHERE  cu1.user_id = 32
AND    cu2.user_id = 3
AND    cu3.conversation_id IS NULL;

NOT EXISTS

SELECT cu1.conversation_id
FROM   conversation_user cu1
JOIN   conversation_user cu2 USING (conversation_id)
WHERE  cu1.user_id = 32
AND    cu2.user_id = 3
AND NOT EXISTS (
   SELECT 1
   FROM   conversation_user cu3
   WHERE  cu3.conversation_id = cu1.conversation_id
   AND    cu3.user_id NOT IN (3,32)
   );

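Neither query depends on a UNIQUE constraint on (conversation_id, user_id), which may or may not be in place. That means the queries still work if user_id 32 (or 3) is listed more than once for the same conversation. You would get duplicate rows in the result, though, and would need to apply DISTINCT or GROUP BY.
The only condition is the one you formulated:

    ... a query that will show that there is a conversation between just user 32 and user 3?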
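The point about duplicate rows can be checked with a small sketch. It makes several assumptions: an in-memory SQLite database (via Python's sqlite3 module) stands in for PostgreSQL, the table holds the five rows from the question, and one hypothetical duplicate row (conversation 1, user 32) is added to simulate a missing UNIQUE constraint. The query is the NOT EXISTS variant, with DISTINCT switched on and off.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for PostgreSQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE conversation_user "
    "(id INTEGER, conversation_id INTEGER, user_id INTEGER)"
)
rows = [
    (1, 1, 32), (2, 1, 3),             # conversation 1: users 32 and 3
    (3, 2, 32), (4, 2, 3), (5, 2, 4),  # conversation 2: users 32, 3 and 4
    (6, 1, 32),                        # hypothetical duplicate, not in the question
]
conn.executemany("INSERT INTO conversation_user VALUES (?, ?, ?)", rows)

# NOT EXISTS variant; {distinct} is filled with "" or "DISTINCT".
QUERY = """
SELECT {distinct} cu1.conversation_id
FROM   conversation_user cu1
JOIN   conversation_user cu2 USING (conversation_id)
WHERE  cu1.user_id = 32
AND    cu2.user_id = 3
AND NOT EXISTS (
   SELECT 1
   FROM   conversation_user cu3
   WHERE  cu3.conversation_id = cu1.conversation_id
   AND    cu3.user_id NOT IN (3, 32)
   )
"""

with_duplicates = conn.execute(QUERY.format(distinct="")).fetchall()
deduplicated = conn.execute(QUERY.format(distinct="DISTINCT")).fetchall()

print(with_duplicates)  # conversation 1 appears once per duplicate row of user 32
print(deduplicated)     # conversation 1 appears once
```

Conversation 2 is rejected in both runs because user 4 satisfies the NOT EXISTS subquery.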

Audited query

The query you linked in the comment wouldn't work. You forgot to exclude other participants. It should be something like:

SELECT *  -- or whatever you want to return
FROM   conversation_user cu1
WHERE  cu1.user_id = 32
AND    EXISTS (
   SELECT 1
   FROM   conversation_user cu2
   WHERE  cu2.conversation_id = cu1.conversation_id
   AND    cu2.user_id = 3
   )
AND    NOT EXISTS (
   SELECT 1
   FROM   conversation_user cu3
   WHERE  cu3.conversation_id = cu1.conversation_id
   AND    cu3.user_id NOT IN (3,32)
   );

Similar to the other two queries, except that it will not return multiple rows if user 3 is linked multiple times.

Answer 1: (score: 1)

You can use conditional aggregation to select every cid (conversation_id) that has only the 2 specific participants:

select conversation_id as cid
from conversation_user
group by conversation_id
having count(*) = 2
and count(case when user_id not in (32,3) then 1 end) = 0;

If (cid, user_id) is not unique, replace having count(*) = 2 with having count(distinct user_id) = 2
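The difference between the two HAVING variants can be sketched as follows. The assumptions are the same kind as elsewhere in this thread's data: an in-memory SQLite database (Python's sqlite3) stands in for PostgreSQL, the rows are those from the question, and one hypothetical duplicate row (conversation 1, user 32) is added. The helper two_party is an illustrative name, not part of the answer.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for PostgreSQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE conversation_user "
    "(id INTEGER, conversation_id INTEGER, user_id INTEGER)"
)
conn.executemany(
    "INSERT INTO conversation_user VALUES (?, ?, ?)",
    [
        (1, 1, 32), (2, 1, 3),             # conversation 1: users 32 and 3
        (3, 2, 32), (4, 2, 3), (5, 2, 4),  # conversation 2: users 32, 3 and 4
        (6, 1, 32),                        # hypothetical duplicate row
    ],
)

def two_party(count_expr):
    """Run the conditional-aggregation query with the given counting expression."""
    sql = f"""
    SELECT conversation_id AS cid
    FROM   conversation_user
    GROUP  BY conversation_id
    HAVING {count_expr} = 2
    AND    count(CASE WHEN user_id NOT IN (32, 3) THEN 1 END) = 0
    """
    return conn.execute(sql).fetchall()

plain = two_party("count(*)")                    # counts rows, so the duplicate breaks it
distinct = two_party("count(DISTINCT user_id)")  # counts participants

print(plain)
print(distinct)
```

With the duplicate present, conversation 1 has three rows but only two distinct participants, so only the count(distinct user_id) variant finds it.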

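Answer 2: (score: 0)

Since you only want conversations with exactly 2 users, you can use a self outer join to other users and filter out matches:

To find all 2-user conversations and who they are between:
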
SELECT
    a.conversation_id cid,
    a.user_id user_id_1,
    b.user_id user_id_2
FROM conversation_user a
JOIN conversation_user b ON b.conversation_id = a.conversation_id
  AND b.user_id > a.user_id
LEFT JOIN conversation_user c ON c.conversation_id = a.conversation_id
  AND c.user_id NOT IN (a.user_id, b.user_id)
WHERE c.conversation_id IS NULL -- only return misses on join to others

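To find all 2-user conversations of a specific user, simply add a filter. Because the join condition b.user_id > a.user_id always puts the smaller ID in a, check both sides:

AND 32 IN (a.user_id, b.user_id)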
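A minimal sketch of the pairwise self-join on the question's five rows, assuming an in-memory SQLite database (Python's sqlite3) as a stand-in for PostgreSQL and using the table's actual column name conversation_id. It also compares two ways of narrowing the result to user 32: since b.user_id > a.user_id places the smaller ID in a, a filter on a.user_id alone only matches when 32 is the smaller ID of the pair.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for PostgreSQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE conversation_user "
    "(id INTEGER, conversation_id INTEGER, user_id INTEGER)"
)
conn.executemany(
    "INSERT INTO conversation_user VALUES (?, ?, ?)",
    [(1, 1, 32), (2, 1, 3), (3, 2, 32), (4, 2, 3), (5, 2, 4)],
)

# a and b form an ordered pair of participants; c looks for any third
# participant, and the IS NULL test keeps only pairs with no such match.
PAIRS = """
SELECT a.conversation_id, a.user_id, b.user_id
FROM   conversation_user a
JOIN   conversation_user b ON b.conversation_id = a.conversation_id
                          AND b.user_id > a.user_id
LEFT   JOIN conversation_user c ON c.conversation_id = a.conversation_id
                               AND c.user_id NOT IN (a.user_id, b.user_id)
WHERE  c.conversation_id IS NULL
"""

all_pairs = conn.execute(PAIRS).fetchall()
smaller_side_only = conn.execute(PAIRS + " AND a.user_id = 32").fetchall()
either_side = conn.execute(PAIRS + " AND 32 IN (a.user_id, b.user_id)").fetchall()

print(all_pairs)          # conversation 1 between users 3 and 32
print(smaller_side_only)  # empty: 32 is the larger ID in its only pair
print(either_side)        # conversation 1 again
```

Conversation 2 never appears, because for each pair of its participants the third user matches the LEFT JOIN.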

Answer 3: (score: 0)

If you just want confirmation:

 select conversation_id
   from  conversation_user
   group by conversation_id
   having bool_and ( user_id in (3,32))
      and count(*) = 2;

If you want the full details, you can use window functions and a CTE like this:

 with a as (
   select *
      -- true when every participant is 3 or 32 and the conversation has 2 rows
      ,bool_and( user_id in (3,32) )
         over  ( partition by conversation_id)
       and 2 = count(user_id)
         over  ( partition by conversation_id)
           as conv_candidates
   from conversation_user
   )
 select * from a where conv_candidates;