SQL query: how to find the users with the most friends in common?

Time: 2015-02-08 20:03:02

Tags: sql oracle

I have two tables. The users table:

USERS(ID,NAME)

The friendship table:

FRIEND(ID1,ID2)

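I want to find the pairs of users who have the most friends in common, where the two users in a pair are not friends with each other.

Finally, I want to print the names of the two users in each pair. For example:

Users table: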

(1,Jimmy)
(2,Sam)
(3,Alices)
(4,Tom)

Friends table:

(1,2)
(1,3)
(4,2)
(4,3)

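Users 1 and 4 have friends 2 and 3 in common. Users 2 and 3 have friends 1 and 4 in common. Both pairs share 2 friends, so we want to print their names as the result: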

Jimmy,Tom
Sam,Alices

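How can I do this in a single query?

2 answers:

Answer 0 (score: 1)

I tested this with SQL Server because that is all I have at hand, but it should translate directly to Oracle syntax.

I have also converted it to Oracle using SQL Fiddle, although I had never worked with Oracle before. See the final query at the bottom.

Sample data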

DECLARE @USERS TABLE (ID int, NAME nvarchar(255));

DECLARE @FRIEND TABLE (ID1 int, ID2 int);

INSERT INTO @USERS (ID, NAME) VALUES (1, 'Jimmy');
INSERT INTO @USERS (ID, NAME) VALUES (2, 'Sam');
INSERT INTO @USERS (ID, NAME) VALUES (3, 'Alice');
INSERT INTO @USERS (ID, NAME) VALUES (4, 'Tom');

INSERT INTO @FRIEND (ID1, ID2) VALUES (1,2);
INSERT INTO @FRIEND (ID1, ID2) VALUES (1,3);
INSERT INTO @FRIEND (ID1, ID2) VALUES (4,2);
INSERT INTO @FRIEND (ID1, ID2) VALUES (4,3);

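Pairs of users

We need pairs of users, which a CROSS JOIN produces. The CROSS JOIN returns twice as many rows as we need, both (1,2) and (2,1), but we need only one row per pair, so we add a filter on the user IDs. The same filter also drops the rows that pair a user with itself.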

WITH
CTE_Pairs
AS
(
    SELECT
        U1.ID AS ID1
        ,U2.ID AS ID2
    FROM
        @USERS AS U1
        CROSS JOIN @USERS AS U2
    WHERE
        U1.ID > U2.ID
)
SELECT *
FROM CTE_Pairs;

Result set:

ID1    ID2
2      1
3      1
4      1
3      2
4      2
4      3
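The pairing step can be tried outside SQL Server. The following is a minimal sketch, assuming SQLite (through Python's standard sqlite3 module) as a stand-in engine and an ordinary USERS table in place of the @USERS table variable; the variable names all_rows and pairs are illustrative only.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for SQL Server.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE USERS (ID INTEGER PRIMARY KEY, NAME TEXT);
    INSERT INTO USERS (ID, NAME) VALUES
        (1, 'Jimmy'), (2, 'Sam'), (3, 'Alice'), (4, 'Tom');
""")

# Unfiltered cross join: every user combined with every user (4 x 4 rows).
all_rows = conn.execute(
    "SELECT COUNT(*) FROM USERS AS U1 CROSS JOIN USERS AS U2"
).fetchone()[0]

# Keeping only U1.ID > U2.ID leaves one row per unordered pair
# and removes the rows that pair a user with itself.
pairs = conn.execute("""
    SELECT U1.ID AS ID1, U2.ID AS ID2
    FROM USERS AS U1
    CROSS JOIN USERS AS U2
    WHERE U1.ID > U2.ID
    ORDER BY U2.ID, U1.ID
""").fetchall()

print(all_rows)  # 16
print(pairs)     # [(2, 1), (3, 1), (4, 1), (3, 2), (4, 2), (4, 3)]
```

With n users the filter leaves n*(n-1)/2 rows, here 6 out of 16.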

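Pairs that are not friends

Once we have all the pairs, we should remove those that are already friends. The FRIEND table may list a pair as either (1,2) or (2,1), so we should check both possibilities. We use EXCEPT to "subtract" those rows.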

....
,CTE_PairsNonFriends
AS
(
    SELECT ID1, ID2
    FROM CTE_Pairs

    EXCEPT

    SELECT ID1, ID2
    FROM @FRIEND

    EXCEPT

    SELECT ID2, ID1
    FROM @FRIEND
)
SELECT *
FROM CTE_PairsNonFriends;

Result set:

ID1    ID2
3      2
4      1
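The double subtraction can be reproduced as follows. This is a sketch under the same assumption as before (SQLite via Python's sqlite3 module, ordinary tables instead of table variables). SQLite accepts EXCEPT as written; Oracle has traditionally spelled this operator MINUS, so the keyword may need changing there.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; tables replace table variables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE USERS (ID INTEGER PRIMARY KEY, NAME TEXT);
    CREATE TABLE FRIEND (ID1 INTEGER, ID2 INTEGER);
    INSERT INTO USERS (ID, NAME) VALUES
        (1, 'Jimmy'), (2, 'Sam'), (3, 'Alice'), (4, 'Tom');
    INSERT INTO FRIEND (ID1, ID2) VALUES (1, 2), (1, 3), (4, 2), (4, 3);
""")

non_friends = conn.execute("""
    WITH CTE_Pairs AS (
        SELECT U1.ID AS ID1, U2.ID AS ID2
        FROM USERS AS U1
        CROSS JOIN USERS AS U2
        WHERE U1.ID > U2.ID
    )
    SELECT ID1, ID2 FROM CTE_Pairs
    EXCEPT                           -- drop friendships stored as (ID1, ID2)
    SELECT ID1, ID2 FROM FRIEND
    EXCEPT                           -- drop friendships stored the other way round
    SELECT ID2, ID1 FROM FRIEND
    ORDER BY 1, 2
""").fetchall()

print(non_friends)  # [(3, 2), (4, 1)]
```

The two EXCEPT operators are applied left to right, so the result is (all pairs minus stored friendships) minus reversed friendships.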

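Friends of the selected users

We now have the final list of pairs. For each user we need the list of their direct friends. A simple join is enough. The FRIEND table can hold a friendship as either (1,2) or (2,1), so we need to join twice, once on each column. We do this first for user ID1, and then separately for user ID2.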

....
,CTE_FriendsOfUser1
AS
(
    SELECT
        CTE_PairsNonFriends.ID1 AS IDUser1
        ,F1.ID2 AS FriendOfUser1
    FROM
        CTE_PairsNonFriends
        INNER JOIN @FRIEND AS F1 ON F1.ID1 = CTE_PairsNonFriends.ID1

    UNION -- sic! not ALL

    SELECT
        CTE_PairsNonFriends.ID1 AS IDUser1
        ,F1.ID1 AS FriendOfUser1
    FROM
        CTE_PairsNonFriends
        INNER JOIN @FRIEND AS F1 ON F1.ID2 = CTE_PairsNonFriends.ID1
)
,CTE_FriendsOfUser2
AS
(
    SELECT
        CTE_PairsNonFriends.ID2 AS IDUser2
        ,F1.ID2 AS FriendOfUser2
    FROM
        CTE_PairsNonFriends
        INNER JOIN @FRIEND AS F1 ON F1.ID1 = CTE_PairsNonFriends.ID2

    UNION -- sic! not ALL

    SELECT
        CTE_PairsNonFriends.ID2 AS IDUser2
        ,F1.ID1 AS FriendOfUser2
    FROM
        CTE_PairsNonFriends
        INNER JOIN @FRIEND AS F1 ON F1.ID2 = CTE_PairsNonFriends.ID2
)

Result set:

SELECT * FROM CTE_FriendsOfUser1

IDUser1    FriendOfUser1
4          2
4          3
3          1
3          4


SELECT * FROM CTE_FriendsOfUser2

IDUser2    FriendOfUser2
1          2
1          3
2          1
2          4
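The "UNION, not ALL" comments in the CTEs matter when the same friendship is stored in both directions. The sketch below illustrates this with hypothetical data (a FRIEND table that contains both (1,2) and (2,1)), again assuming SQLite via Python's sqlite3 module rather than SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE FRIEND (ID1 INTEGER, ID2 INTEGER);
    -- Hypothetical data: the friendship between 1 and 2 is stored twice.
    INSERT INTO FRIEND (ID1, ID2) VALUES (1, 2), (2, 1), (1, 3);
""")

# Friends of user 1, looked up through both columns, duplicates kept.
union_all_rows = conn.execute("""
    SELECT ID1 AS UserID, ID2 AS FriendID FROM FRIEND WHERE ID1 = 1
    UNION ALL
    SELECT ID2, ID1 FROM FRIEND WHERE ID2 = 1
    ORDER BY 2
""").fetchall()

# Same lookup with UNION, which removes the duplicate row.
union_rows = conn.execute("""
    SELECT ID1 AS UserID, ID2 AS FriendID FROM FRIEND WHERE ID1 = 1
    UNION
    SELECT ID2, ID1 FROM FRIEND WHERE ID2 = 1
    ORDER BY 2
""").fetchall()

print(union_all_rows)  # [(1, 2), (1, 2), (1, 3)]
print(union_rows)      # [(1, 2), (1, 3)]
```

With UNION ALL, friend 2 would be counted twice in any later COUNT(*), which is why the plain UNION is the safer choice here.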

Mutual friends

Join the friend list of user1 with the friend list of user2 on the friend column to find the friends they have in common.

....
,CTE_MutualFriends
AS
(
    SELECT *
    FROM
        CTE_FriendsOfUser1
        INNER JOIN CTE_FriendsOfUser2 ON CTE_FriendsOfUser2.FriendOfUser2 = CTE_FriendsOfUser1.FriendOfUser1
    WHERE
        CTE_FriendsOfUser1.IDUser1 <> CTE_FriendsOfUser2.IDUser2
)

Counting mutual friends

,CTE_FriendCount
AS
(
    SELECT
        IDUser1
        ,IDUser2
        ,COUNT(*) AS FriendCount
    FROM CTE_MutualFriends
    GROUP BY IDUser1, IDUser2
)
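The join-then-count idea can be sketched in a compact form. This is not the author's exact CTE chain: it assumes SQLite via Python's sqlite3 module, builds one symmetric friend list (the hypothetical CTE name Sym) instead of two separate ones, and does not yet exclude pairs that are already friends.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE FRIEND (ID1 INTEGER, ID2 INTEGER);
    INSERT INTO FRIEND (ID1, ID2) VALUES (1, 2), (1, 3), (4, 2), (4, 3);
""")

rows = conn.execute("""
    WITH Sym AS (
        -- Every friendship listed from both sides, without duplicates.
        SELECT ID1 AS UserID, ID2 AS FriendID FROM FRIEND
        UNION
        SELECT ID2, ID1 FROM FRIEND
    )
    SELECT A.UserID AS IDUser1, B.UserID AS IDUser2, COUNT(*) AS FriendCount
    FROM Sym AS A
    INNER JOIN Sym AS B ON B.FriendID = A.FriendID   -- same friend on both sides
    WHERE A.UserID > B.UserID                        -- one row per unordered pair
    GROUP BY A.UserID, B.UserID
    ORDER BY IDUser1, IDUser2
""").fetchall()

print(rows)  # [(3, 2, 2), (4, 1, 2)]
```

Each joined row is one (pair, shared friend) combination, so COUNT(*) per pair is the number of mutual friends.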

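Final full query with user names

Order the result by friend count. You can return only the first row (or the first few rows) to get the users with the most mutual friends. Strictly speaking, it should be TOP WITH TIES.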

WITH
CTE_Pairs
AS
(
    SELECT
        U1.ID AS ID1
        ,U2.ID AS ID2
    FROM
        @USERS AS U1
        CROSS JOIN @USERS AS U2
    WHERE
        U1.ID > U2.ID
)
,CTE_PairsNonFriends
AS
(
    SELECT ID1, ID2
    FROM CTE_Pairs

    EXCEPT

    SELECT ID1, ID2
    FROM @FRIEND

    EXCEPT

    SELECT ID2, ID1
    FROM @FRIEND
)
,CTE_FriendsOfUser1
AS
(
    SELECT
        CTE_PairsNonFriends.ID1 AS IDUser1
        ,F1.ID2 AS FriendOfUser1
    FROM
        CTE_PairsNonFriends
        INNER JOIN @FRIEND AS F1 ON F1.ID1 = CTE_PairsNonFriends.ID1

    UNION -- sic! not ALL

    SELECT
        CTE_PairsNonFriends.ID1 AS IDUser1
        ,F1.ID1 AS FriendOfUser1
    FROM
        CTE_PairsNonFriends
        INNER JOIN @FRIEND AS F1 ON F1.ID2 = CTE_PairsNonFriends.ID1
)
,CTE_FriendsOfUser2
AS
(
    SELECT
        CTE_PairsNonFriends.ID2 AS IDUser2
        ,F1.ID2 AS FriendOfUser2
    FROM
        CTE_PairsNonFriends
        INNER JOIN @FRIEND AS F1 ON F1.ID1 = CTE_PairsNonFriends.ID2

    UNION -- sic! not ALL

    SELECT
        CTE_PairsNonFriends.ID2 AS IDUser2
        ,F1.ID1 AS FriendOfUser2
    FROM
        CTE_PairsNonFriends
        INNER JOIN @FRIEND AS F1 ON F1.ID2 = CTE_PairsNonFriends.ID2
)
,CTE_MutualFriendsRaw
AS
(
    SELECT
        CTE_FriendsOfUser1.FriendOfUser1 AS MutualFriend
        ,IDUser1
        ,IDUser2
    FROM
        CTE_FriendsOfUser1
        INNER JOIN CTE_FriendsOfUser2 ON CTE_FriendsOfUser2.FriendOfUser2 = CTE_FriendsOfUser1.FriendOfUser1
    WHERE
        CTE_FriendsOfUser1.IDUser1 <> CTE_FriendsOfUser2.IDUser2
)
,CTE_MutualFriends
AS
(
    SELECT DISTINCT
        MutualFriend
        ,CASE WHEN IDUser1 < IDUser2 THEN IDUser1 ELSE IDUser2 END AS IDUser1
        ,CASE WHEN IDUser1 < IDUser2 THEN IDUser2 ELSE IDUser1 END AS IDUser2
    FROM
        CTE_MutualFriendsRaw
)
,CTE_FriendCount
AS
(
    SELECT
        IDUser1
        ,IDUser2
        ,COUNT(*) AS FriendCount
    FROM CTE_MutualFriends
    GROUP BY IDUser1, IDUser2
)
SELECT
    CTE_FriendCount.IDUser1
    ,CTE_FriendCount.IDUser2
    ,CTE_FriendCount.FriendCount
    ,U1.NAME AS Name1
    ,U2.NAME AS Name2
FROM
    CTE_FriendCount
    INNER JOIN @USERS AS U1 ON U1.ID = CTE_FriendCount.IDUser1
    INNER JOIN @USERS AS U2 ON U2.ID = CTE_FriendCount.IDUser2
ORDER BY FriendCount DESC
;

Result set:

IDUser1    IDUser2    FriendCount    Name1    Name2
4          1          2              Tom      Jimmy
3          2          2              Alice    Sam
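Both rows above tie at 2 mutual friends, which is why returning a single top row is not enough. In SQL Server this is what TOP (1) WITH TIES with an ORDER BY does. The sketch below shows a portable equivalent that compares against the maximum count; it assumes SQLite via Python's sqlite3 module and a hypothetical PairCounts table holding already-computed counts, with one extra made-up row (5,1,1) to show a pair that should be excluded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical table of precomputed counts; (5, 1, 1) is invented here.
    CREATE TABLE PairCounts (IDUser1 INTEGER, IDUser2 INTEGER, FriendCount INTEGER);
    INSERT INTO PairCounts VALUES (4, 1, 2), (3, 2, 2), (5, 1, 1);
""")

# Taking just the first row silently drops one of the tied pairs.
first_only = conn.execute("""
    SELECT IDUser1, IDUser2, FriendCount
    FROM PairCounts
    ORDER BY FriendCount DESC, IDUser1
    LIMIT 1
""").fetchall()

# Comparing against the maximum keeps every pair that ties for first place.
with_ties = conn.execute("""
    SELECT IDUser1, IDUser2, FriendCount
    FROM PairCounts
    WHERE FriendCount = (SELECT MAX(FriendCount) FROM PairCounts)
    ORDER BY IDUser1, IDUser2
""").fetchall()

print(first_only)  # [(3, 2, 2)]
print(with_ties)   # [(3, 2, 2), (4, 1, 2)]
```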

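There may be a problem in CTE_MutualFriends. It is the same issue as before: a pair can be listed as either (1,2) or (2,1). For example, you could get the pair (a,b) with a count of N and the pair (b,a) with a different count of M. Strictly speaking, there should be another step that finds such pairs and combines them. I was not sure whether the query handled such pairs.

The original version of CTE_MutualFriends did have this problem, so I added an extra step that removes the duplicates in the final full version of the query. The given sample data is too small to contain all possible variations, so the original version happened to give the correct result. If we add more entries to the sample data, we will see that the extra step is needed.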
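The effect of the extra step can be seen on a few rows. The sketch below uses hypothetical raw rows (a MutualRaw table in which the pair of users 2 and 3 appears in both orders) and assumes SQLite via Python's sqlite3 module; the CASE expressions follow the ones in the final query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical raw rows: users 2 and 3 share friends 5 and 6,
    -- and the shared friend 5 shows up under both orderings of the pair.
    CREATE TABLE MutualRaw (MutualFriend INTEGER, IDUser1 INTEGER, IDUser2 INTEGER);
    INSERT INTO MutualRaw VALUES (5, 3, 2), (5, 2, 3), (6, 3, 2);
""")

# Counting directly splits the pair into two groups with different counts.
naive = conn.execute("""
    SELECT IDUser1, IDUser2, COUNT(*) AS FriendCount
    FROM MutualRaw
    GROUP BY IDUser1, IDUser2
    ORDER BY IDUser1, IDUser2
""").fetchall()

# Putting the smaller ID first and applying DISTINCT merges the two orderings.
normalized = conn.execute("""
    WITH Normalized AS (
        SELECT DISTINCT
            R.MutualFriend,
            CASE WHEN R.IDUser1 < R.IDUser2 THEN R.IDUser1 ELSE R.IDUser2 END AS IDUser1,
            CASE WHEN R.IDUser1 < R.IDUser2 THEN R.IDUser2 ELSE R.IDUser1 END AS IDUser2
        FROM MutualRaw AS R
    )
    SELECT IDUser1, IDUser2, COUNT(*) AS FriendCount
    FROM Normalized
    GROUP BY IDUser1, IDUser2
""").fetchall()

print(naive)       # [(2, 3, 1), (3, 2, 2)]
print(normalized)  # [(2, 3, 2)]
```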

Oracle syntax version

See http://sqlfiddle.com/#!4/48e1f/21/0


Answer 1 (score: 0)

I think you want something like this:

WITH uf AS (
    SELECT id1 AS user_id, id2 AS friend_id FROM friends
     UNION ALL
    SELECT id2 AS user_id, id1 AS friend_id FROM friends
), xf AS (
    SELECT user_id1, user_id2, friend_cnt FROM (
        SELECT uf1.user_id AS user_id1, uf2.user_id AS user_id2
             , COUNT(*) AS friend_cnt
             , RANK() OVER ( ORDER BY COUNT(*) DESC ) AS rn
          FROM uf uf1 INNER JOIN uf uf2
            ON uf1.friend_id = uf2.friend_id
           AND uf1.user_id < uf2.user_id
         GROUP BY uf1.user_id, uf2.user_id
    ) WHERE rn = 1
)
SELECT xf.friend_cnt, u1.username || ',' || u2.username
  FROM xf INNER JOIN users u1
    ON xf.user_id1 = u1.user_id
 INNER JOIN users u2
    ON xf.user_id2 = u2.user_id;

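In the first CTE, I get each user together with their friends; in the second, I get the pairs of users who have friends in common and rank them by count; in the main query, I just fetch the user names and concatenate them.

Please see SQL Fiddle demo here. Note that although Jimmy and Tom each have four friends in the demo, the value of friend_cnt is 3, because that is the number of friends they have in common (I added a few friends to your sample data).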
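For experimentation, here is a sketch of the same ranking approach adapted to the table and column names from the question (USERS(ID, NAME) and FRIEND(ID1, ID2)). It is not the answer's query verbatim. The assumptions are: SQLite 3.25 or later (needed for window functions) via Python's sqlite3 module, UNION instead of UNION ALL so that a friendship stored in both directions is not counted twice, and an added NOT EXISTS filter to skip pairs that are already friends, as the question requires. The CTE names counts and ranked are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE USERS (ID INTEGER PRIMARY KEY, NAME TEXT);
    CREATE TABLE FRIEND (ID1 INTEGER, ID2 INTEGER);
    INSERT INTO USERS VALUES (1, 'Jimmy'), (2, 'Sam'), (3, 'Alices'), (4, 'Tom');
    INSERT INTO FRIEND VALUES (1, 2), (1, 3), (4, 2), (4, 3);
""")

query = """
WITH uf AS (
    -- Each friendship seen from both sides.
    SELECT ID1 AS user_id, ID2 AS friend_id FROM FRIEND
    UNION
    SELECT ID2, ID1 FROM FRIEND
),
counts AS (
    SELECT uf1.user_id AS user_id1, uf2.user_id AS user_id2,
           COUNT(*) AS friend_cnt
    FROM uf AS uf1
    INNER JOIN uf AS uf2
        ON uf1.friend_id = uf2.friend_id
       AND uf1.user_id < uf2.user_id
    WHERE NOT EXISTS (               -- added: skip pairs that are already friends
        SELECT 1 FROM uf AS d
        WHERE d.user_id = uf1.user_id AND d.friend_id = uf2.user_id
    )
    GROUP BY uf1.user_id, uf2.user_id
),
ranked AS (
    -- RANK keeps all pairs that tie for the highest count.
    SELECT user_id1, user_id2, friend_cnt,
           RANK() OVER (ORDER BY friend_cnt DESC) AS rn
    FROM counts
)
SELECT u1.NAME || ',' || u2.NAME AS names, r.friend_cnt
FROM ranked AS r
INNER JOIN USERS AS u1 ON u1.ID = r.user_id1
INNER JOIN USERS AS u2 ON u2.ID = r.user_id2
WHERE r.rn = 1
ORDER BY r.user_id1
"""
result = conn.execute(query).fetchall()
print(result)  # [('Jimmy,Tom', 2), ('Sam,Alices', 2)]
```

On the question's sample data this prints the two name pairs the question expects.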