Given a table of users:
User(id INT, username VARCHAR(30))
and the directed relationships between them:
Following(follower_id INT, followee_id INT)
I need a SELECT that returns all unique triplets of users such that, for example:
A follows B
B follows A
A follows C
C does not follow A
B does not follow C
C follows B
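I am using an SQLite database from Python. With a SELECT for the example above, I could probably work out all the other triads I am after fairly quickly. These are basically all the possible combinations of directed connections among three users.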
Answer 0: (score: 1)
This is a bit complicated, but you can do it like this:
-- pairs: users who follow each other (mutual follows)
with pairs as (
      select f1.followee_id, f1.follower_id
      from following f1 join
           following f2
           on f1.follower_id = f2.followee_id and
              f1.followee_id = f2.follower_id
     )
-- A = p1.followee_id, B = p1.follower_id, C = the follower shared by p2 and p3
select p1.followee_id as A, p1.follower_id as B, p3.follower_id as C
from pairs p1 join
     pairs p2
     on p1.followee_id = p2.followee_id join
     pairs p3
     on p3.followee_id = p1.follower_id and
        p3.follower_id = p2.follower_id;
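The idea is that pairs gets the pairs of users who follow each other. The query then looks for further pairs that add a third person, so it returns triples in which all three users follow one another.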
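As a quick check of the idea above, here is a minimal Python sketch using the standard-library sqlite3 module. The in-memory database, the sample edges, and the helper name mutual_triples are assumptions made for illustration only. The query follows the pairs approach and takes C from p3.follower_id. It relies on WITH support, which SQLite has had since version 3.8.3.

```python
import sqlite3

# Mutual-follow pairs, then a self-join that closes the triangle.
PAIRS_QUERY = """
WITH pairs AS (
    SELECT f1.followee_id, f1.follower_id
    FROM Following f1
    JOIN Following f2
      ON f1.follower_id = f2.followee_id
     AND f1.followee_id = f2.follower_id
)
SELECT p1.followee_id AS a, p1.follower_id AS b, p3.follower_id AS c
FROM pairs p1
JOIN pairs p2 ON p1.followee_id = p2.followee_id
JOIN pairs p3 ON p3.followee_id = p1.follower_id
             AND p3.follower_id = p2.follower_id
"""


def mutual_triples(conn):
    """Return the set of id triples in which all three users follow each other."""
    rows = conn.execute(PAIRS_QUERY).fetchall()
    # The join yields every ordering of a triple, so sort each row to deduplicate.
    return {tuple(sorted(row)) for row in rows}


# Hypothetical sample data: users 1, 2, 3 all follow each other; 4 only follows 1.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Following (follower_id INT, followee_id INT)")
conn.executemany(
    "INSERT INTO Following VALUES (?, ?)",
    [(1, 2), (2, 1), (1, 3), (3, 1), (2, 3), (3, 2), (4, 1)],
)
print(mutual_triples(conn))
```

Because the raw query returns each triple once per ordering of A, B and C, the sketch deduplicates in Python. Adding an ordering condition in SQL would be another option.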
Another approach is to generate all the combinations and then pick the ones that match:
-- Enumerate each unordered triple once (a.id < b.id < c.id),
-- then keep the triples where every directed follow exists.
select a.id, b.id, c.id
from User a join
     User b
     on a.id < b.id join
     User c
     on b.id < c.id
where exists (select 1 from following f where f.follower_id = a.id and f.followee_id = b.id) and
      exists (select 1 from following f where f.follower_id = b.id and f.followee_id = a.id) and
      exists (select 1 from following f where f.follower_id = a.id and f.followee_id = c.id) and
      exists (select 1 from following f where f.follower_id = c.id and f.followee_id = a.id) and
      exists (select 1 from following f where f.follower_id = b.id and f.followee_id = c.id) and
      exists (select 1 from following f where f.follower_id = c.id and f.followee_id = b.id);
If you have reasonable indexes on the tables, this version might actually perform better.
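The EXISTS form adapts to other triads by switching individual conditions to NOT EXISTS. The following Python sketch, which is only one possible adaptation and not something stated in the answer, targets the specific pattern from the question (A and B follow each other, A follows C, C follows B, C does not follow A, B does not follow C). The helper names follows and find_triads and the sample rows are hypothetical. Since this pattern is not symmetric, the sketch uses inequality joins rather than the a.id < b.id < c.id ordering.

```python
import sqlite3


def follows(x, y):
    """Build an EXISTS subquery testing whether user alias x follows user alias y."""
    return (
        "EXISTS (SELECT 1 FROM Following f "
        f"WHERE f.follower_id = {x}.id AND f.followee_id = {y}.id)"
    )


# Aliases are fixed strings defined here, never user input.
TRIAD_QUERY = f"""
SELECT a.id, b.id, c.id
FROM User a
JOIN User b ON b.id <> a.id
JOIN User c ON c.id <> a.id AND c.id <> b.id
WHERE {follows('a', 'b')}
  AND {follows('b', 'a')}
  AND {follows('a', 'c')}
  AND NOT {follows('c', 'a')}
  AND NOT {follows('b', 'c')}
  AND {follows('c', 'b')}
"""


def find_triads(conn):
    """Return (A, B, C) id tuples matching the triad described in the question."""
    return conn.execute(TRIAD_QUERY).fetchall()


# Hypothetical sample data for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE User (id INT, username VARCHAR(30))")
conn.execute("CREATE TABLE Following (follower_id INT, followee_id INT)")
conn.executemany(
    "INSERT INTO User VALUES (?, ?)",
    [(1, "alice"), (2, "bob"), (3, "carol"), (4, "dave")],
)
conn.executemany(
    "INSERT INTO Following VALUES (?, ?)",
    [(1, 2), (2, 1), (1, 3), (3, 2), (4, 1)],
)
print(find_triads(conn))
```

On this sample data the only match is A = 1, B = 2, C = 3.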
EDIT:
For performance, the following table should have an index on (follower_id, followee_id), that is, a single composite index covering both columns.
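A minimal sketch of creating such an index from Python and checking that SQLite picks it up for the lookup used inside each EXISTS subquery. The index name idx_following_pair is an assumption chosen for illustration, and the exact wording of the EXPLAIN QUERY PLAN output differs between SQLite versions, so the sketch only prints it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Following (follower_id INT, followee_id INT)")

# Composite index on both columns, follower_id first (hypothetical name).
conn.execute(
    "CREATE INDEX idx_following_pair ON Following (follower_id, followee_id)"
)

# Ask SQLite how it would run the lookup performed by each EXISTS subquery.
plan = conn.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT 1 FROM Following WHERE follower_id = ? AND followee_id = ?",
    (1, 2),
).fetchall()

# The human-readable description is the last column of each plan row.
plan_details = [row[-1] for row in plan]
for detail in plan_details:
    print(detail)
```

If the printed plan mentions the index name, the lookup is an index search rather than a full table scan.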
Answer 1: (score: 0)
SELECT ab.follower_id AS a_id,
ab.followee_id AS b_id,
ac.followee_id AS c_id
FROM following AS ab
JOIN following AS ba ON ab.followee_id = ba.follower_id
AND ab.follower_id = ba.followee_id
JOIN following AS ac ON ab.follower_id = ac.follower_id
JOIN following AS cb ON ac.followee_id = cb.follower_id
AND ab.followee_id = cb.followee_id
LEFT OUTER JOIN following AS ca ON ac.followee_id = ca.follower_id
AND ac.follower_id = ca.followee_id
LEFT OUTER JOIN following AS bc ON cb.followee_id = bc.follower_id
AND cb.follower_id = bc.followee_id
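WHERE ab.follower_id < ab.followee_id   -- keeps only triads with a_id < b_id < c_id;
AND ab.followee_id < ac.followee_id     -- the pattern is asymmetric, so other id orders are filtered out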
AND ca.follower_id IS NULL
AND bc.follower_id IS NULL
On 3 million records this executes in 30 seconds, compared with 45k seconds for the EXISTS version proposed by Gordon.
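The LEFT OUTER JOIN ... IS NULL conditions in this query act as anti-joins: a triad is kept only when no matching "C follows A" or "B follows C" row exists. Here is a small Python sketch that runs the query as posted against a hypothetical in-memory data set. The sample edges and the helper name find_triads_join are assumptions, and the ids were chosen so that the match satisfies the a_id < b_id < c_id filter in the WHERE clause.

```python
import sqlite3

JOIN_QUERY = """
SELECT ab.follower_id AS a_id,
       ab.followee_id AS b_id,
       ac.followee_id AS c_id
FROM Following AS ab
JOIN Following AS ba ON ab.followee_id = ba.follower_id
                    AND ab.follower_id = ba.followee_id
JOIN Following AS ac ON ab.follower_id = ac.follower_id
JOIN Following AS cb ON ac.followee_id = cb.follower_id
                    AND ab.followee_id = cb.followee_id
LEFT OUTER JOIN Following AS ca ON ac.followee_id = ca.follower_id
                               AND ac.follower_id = ca.followee_id
LEFT OUTER JOIN Following AS bc ON cb.followee_id = bc.follower_id
                               AND cb.follower_id = bc.followee_id
WHERE ab.follower_id < ab.followee_id
  AND ab.followee_id < ac.followee_id
  AND ca.follower_id IS NULL   -- anti-join: C does not follow A
  AND bc.follower_id IS NULL   -- anti-join: B does not follow C
"""


def find_triads_join(conn):
    """Return (a_id, b_id, c_id) rows produced by the join-based query."""
    return conn.execute(JOIN_QUERY).fetchall()


# Hypothetical sample data: 1 and 2 follow each other, 1 follows 3, 3 follows 2.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Following (follower_id INT, followee_id INT)")
conn.executemany(
    "INSERT INTO Following VALUES (?, ?)",
    [(1, 2), (2, 1), (1, 3), (3, 2), (4, 1)],
)
print(find_triads_join(conn))
```

Adding a row such as (3, 1) or (2, 3) to the sample data makes the corresponding LEFT OUTER JOIN find a match, so the IS NULL test fails and the triad is no longer returned.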