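PLSQL Recursive Query for Friend Analysis

Time: 2017-05-13 12:06:50

Tags: sql oracle oracle11g recursive-query

I have a social network table named RELATION_TABLE. It has three columns: USER_ID_1, USER_ID_2, and RELATION_TYPE_CODE (e.g. close friend, family member, acquaintance, college friend).

Table structure and sample records: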

DROP table RELATION_TABLE;
create table RELATION_TABLE
(
    USER_ID_1 NUMBER,
    USER_ID_2 NUMBER,
    RELATION_TYPE_CODE VARCHAR2(100) 
);

INSERT INTO RELATION_TABLE(USER_ID_1,USER_ID_2,RELATION_TYPE_CODE) 
VALUES(1,2,'CLOSE FRIEND');
INSERT INTO RELATION_TABLE(USER_ID_1,USER_ID_2,RELATION_TYPE_CODE) 
VALUES(4,1,'HIGH SCHOOL FRIEND');
INSERT INTO RELATION_TABLE(USER_ID_1,USER_ID_2,RELATION_TYPE_CODE) 
VALUES(5,2,'FAMILY MEMBER');
INSERT INTO RELATION_TABLE(USER_ID_1,USER_ID_2,RELATION_TYPE_CODE) 
VALUES(1,6,'COLLEAGUE');
INSERT INTO RELATION_TABLE(USER_ID_1,USER_ID_2,RELATION_TYPE_CODE) 
VALUES(3,4,'PARTNER');
INSERT INTO RELATION_TABLE(USER_ID_1,USER_ID_2,RELATION_TYPE_CODE) 
VALUES(3,6,'COLLEAGUE');
COMMIT;

Sample records:

USER_ID_1    USER_ID_2    RELATION_TYPE_CODE
1            2            CLOSE FRIEND
4            1            HIGH SCHOOL FRIEND
5            2            FAMILY MEMBER
1            6            COLLEAGUE
3            4            PARTNER
3            6            COLLEAGUE

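According to the sample records, user 1 is related to 4, 4 is related to 3, and 3 is related to 6, so 1 may be related to 4, 3, and 6.

So I need to write a recursive query to insert all possible relationships. I tried CONNECT BY before, but there is no direct relationship here such as child-parent: any user ID can appear in either the USER_ID_1 or the USER_ID_2 column. There may also be cycles, and I need to ignore them.

Do you have any suggestions for an approach?

Thanks

2 answers:

Answer 0: (score: 1)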

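Since your data set has no directionality in its relationships, if you want all the transitive relationships you need to handle chains of relationships that begin with USER_ID_1 -> USER_ID_2 as well as USER_ID_2 -> USER_ID_1.

Since you mention you are on 11g, recursive subquery factoring may be an option for you, but because it is not available until 11gR2, I'll avoid it in this example and use CONNECT BY.

In your example, you want relationship records overall from user # 1 to users # 3,4,6 (and probably user # 2, from the CLOSE FRIEND relationship you included).
To combine these relationships, one could try a combined query of the relationship tree rooted at USER_ID_1 plus the relationship tree rooted at USER_ID_2, with NOCYCLE to ignore loops (but this won't work):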

SELECT
  CONNECT_BY_ROOT USER_ID_2 AS STARTING_USER_ID,
  USER_ID_1                 AS RELATED_USER_ID
FROM RELATION_TABLE
START WITH USER_ID_2 = 1
CONNECT BY NOCYCLE PRIOR USER_ID_1 = USER_ID_2
UNION
SELECT
  CONNECT_BY_ROOT USER_ID_1 AS STARTING_USER_ID,
  USER_ID_2                 AS RELATED_USER_ID
FROM RELATION_TABLE
START WITH USER_ID_1 = 1
CONNECT BY NOCYCLE PRIOR USER_ID_2 = USER_ID_1
ORDER BY 1 ASC, 2 ASC;

Result:

STARTING_USER_ID  RELATED_USER_ID  
1                 2                
1                 3                
1                 4                
1                 6            

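This seems close (it includes the three relationships you mentioned, plus the 1 -> 2 relationship from the USER_ID_1 side of user # 1).

But looking closely at the data, a relationship is missing. If you look at the records, user # 2 is related to user # 5, and user # 1 is related to user # 2, so user # 1 should also be related to user # 5. I believe you pointed this out in your post: there is no direct parent-child relationship (the network has no directionality, but the query does).

To achieve this, one (inefficient) approach is to query the combined set of a -> b and b -> a relationships, doubling the data set so that the hierarchical query can proceed as if the relationships were directed.

In the following query, user # 1 can now navigate through user # 2 to connect to user # 5. One side effect of this query is that it creates artificial self-relationships that must be removed. In the example provided, there is a UNION ALL to add true self-relationships back in.

For space, I've used LISTAGG here to condense the results: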

WITH PSEUDO_DIRECTED_RELATION AS (
  SELECT
    USER_ID_1 AS LEFT_ID,
    USER_ID_2 AS RIGHT_ID
  FROM RELATION_TABLE
  UNION
  SELECT
    USER_ID_2 AS LEFT_ID,
    USER_ID_1 AS RIGHT_ID
  FROM RELATION_TABLE)
SELECT STARTING_ID, LISTAGG(RELATED_ID,',') WITHIN GROUP (ORDER BY RELATED_ID ASC) AS RELATED_USERS
FROM (
SELECT
  DISTINCT
  CONNECT_BY_ROOT RIGHT_ID AS STARTING_ID,
  LEFT_ID                  AS RELATED_ID
FROM PSEUDO_DIRECTED_RELATION
WHERE LEFT_ID <> CONNECT_BY_ROOT RIGHT_ID
START WITH RIGHT_ID = 1
CONNECT BY NOCYCLE PRIOR LEFT_ID = RIGHT_ID
UNION ALL
SELECT
  USER_ID_1 AS STARTING_ID,
  USER_ID_2 AS RELATED_ID
FROM RELATION_TABLE
WHERE USER_ID_1 = USER_ID_2
      AND USER_ID_1 = 1)
  GROUP BY STARTING_ID
ORDER BY 1 ASC;

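Result:

STARTING_ID  RELATED_USERS
1            2,3,4,5,6

Now user # 1 is related to user # 5 via user # 2. But maybe this is just relating everything to everything, so let's add more data:

INSERT INTO RELATION_TABLE VALUES (7,9,'Siblings');
INSERT INTO RELATION_TABLE VALUES (7,13,'Pen Pals');
INSERT INTO RELATION_TABLE VALUES (22,7,'Colleagues');

Then rerun the query above. user # 7 should not be related to user # 1 (they are not connected at all).

STARTING_ID  RELATED_USERS
1            2,3,4,5,6

Now, if we relate user # 7 to itself:

INSERT INTO RELATION_TABLE VALUES (7,7,'Self');

and retarget the query at user # 7 instead of user # 1 (changing START WITH, etc.):

STARTING_ID  RELATED_USERS
7            7,9,13,22

If you don't want to query for a single root user, you can remove the START WITH clause and the root-user predicate (AND USER_ID_1 = 1) from the self-relationship branch.

The following query returns all transitively related users for every user: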

WITH PSEUDO_DIRECTED_RELATION AS (
  SELECT
    USER_ID_1 AS LEFT_ID,
    USER_ID_2 AS RIGHT_ID
  FROM RELATION_TABLE
  UNION
  SELECT
    USER_ID_2 AS LEFT_ID,
    USER_ID_1 AS RIGHT_ID
  FROM RELATION_TABLE)
SELECT STARTING_ID, LISTAGG(RELATED_ID,',') WITHIN GROUP (ORDER BY RELATED_ID ASC) AS RELATED_USERS
  FROM (
SELECT
  DISTINCT
  CONNECT_BY_ROOT RIGHT_ID AS STARTING_ID,
  LEFT_ID                  AS RELATED_ID
FROM PSEUDO_DIRECTED_RELATION
WHERE LEFT_ID <> CONNECT_BY_ROOT RIGHT_ID
CONNECT BY NOCYCLE PRIOR LEFT_ID = RIGHT_ID
UNION ALL
SELECT
  USER_ID_1 AS STARTING_ID,
  USER_ID_2 AS RELATED_ID
FROM RELATION_TABLE
WHERE USER_ID_1 = USER_ID_2)
  GROUP BY STARTING_ID
ORDER BY 1 ASC, 2 ASC;
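
The question asks to insert the relationships rather than only list them. As a minimal sketch, assuming a hypothetical target table USER_RELATED_USERS (the table name and its two columns are my own, not from the question), the same CONNECT BY approach could feed an INSERT ... SELECT with one row per pair instead of a LISTAGG string. Self-relationships are left out here; add the UNION ALL branch from above if you need them.

```sql
-- Hypothetical target table: name and columns are assumptions for this sketch.
CREATE TABLE USER_RELATED_USERS
(
    USER_ID         NUMBER,
    RELATED_USER_ID NUMBER
);

-- One row per (user, transitively related user) pair.
INSERT INTO USER_RELATED_USERS (USER_ID, RELATED_USER_ID)
WITH PSEUDO_DIRECTED_RELATION AS (
  -- every relation in both directions
  SELECT USER_ID_1 AS LEFT_ID, USER_ID_2 AS RIGHT_ID FROM RELATION_TABLE
  UNION
  SELECT USER_ID_2 AS LEFT_ID, USER_ID_1 AS RIGHT_ID FROM RELATION_TABLE)
SELECT DISTINCT
  CONNECT_BY_ROOT RIGHT_ID AS USER_ID,
  LEFT_ID                  AS RELATED_USER_ID
FROM PSEUDO_DIRECTED_RELATION
-- drop the artificial self-relationships created by walking back to the root
WHERE LEFT_ID <> CONNECT_BY_ROOT RIGHT_ID
CONNECT BY NOCYCLE PRIOR LEFT_ID = RIGHT_ID;

COMMIT;
```

Like the query above, this walks every path in the doubled data set, so it is likely to get slow on a large or densely connected network.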

Answer 1: (score: 1)

A version using WITH (recursive subquery factoring):

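WITH m AS
          -- every relation in both directions, so the graph can be walked either way
          (SELECT USER_ID_1 u1, USER_ID_2 u2 FROM RELATION_TABLE
           UNION
           SELECT USER_ID_2, USER_ID_1 FROM RELATION_TABLE),
     recur (usr, fri) AS
          -- anchor: each user starts out paired with itself
          (SELECT u1, u1 FROM m
           UNION ALL
           -- recursive step: extend the chain by one relation
           SELECT r.usr, u2 FROM recur r, m WHERE r.fri = m.u1)
           -- flag rows that revisit a user already on the current path
           CYCLE fri SET cycle TO 1 DEFAULT 0
SELECT    usr,
         listagg(fri, ',') within GROUP (ORDER BY fri) friends
-- drop the self pairs and the flagged cycle rows before aggregating
FROM (SELECT DISTINCT usr, fri FROM recur WHERE usr != fri AND cycle = 0)
GROUP BY  usr;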
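
If only one user's network is needed, the recursion can be seeded with that user alone instead of with every user. A minimal sketch of that variation, assuming 11gR2 or later; the starting user 1 is hard-coded for illustration, and is_cycle is just my own name for the cycle-mark column:

```sql
WITH m AS
     -- every relation in both directions
     (SELECT USER_ID_1 u1, USER_ID_2 u2 FROM RELATION_TABLE
      UNION
      SELECT USER_ID_2, USER_ID_1 FROM RELATION_TABLE),
     recur (usr, fri) AS
     -- anchor: start from user 1 only (assumed starting user)
     (SELECT 1, 1 FROM DUAL
      UNION ALL
      -- recursive step: follow one more relation from the last user reached
      SELECT r.usr, m.u2 FROM recur r, m WHERE r.fri = m.u1)
     CYCLE fri SET is_cycle TO 1 DEFAULT 0
SELECT usr,
       LISTAGG(fri, ',') WITHIN GROUP (ORDER BY fri) AS friends
FROM (SELECT DISTINCT usr, fri FROM recur WHERE usr != fri AND is_cycle = 0)
GROUP BY usr;
```

For the six sample rows in the question this should list 2,3,4,5,6 for user 1, the same set the CONNECT BY query in the other answer returns.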