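I have a social network table named RELATION_TABLE. It has three columns: USER_ID_1, USER_ID_2 and RELATION_TYPE_CODE (for example close friend, family member, acquaintance, college friend, and so on).
Table structure and sample data: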
DROP table RELATION_TABLE;
create table RELATION_TABLE
(
USER_ID_1 NUMBER,
USER_ID_2 NUMBER,
RELATION_TYPE_CODE VARCHAR2(100)
);
INSERT INTO RELATION_TABLE(USER_ID_1,USER_ID_2,RELATION_TYPE_CODE)
VALUES(1,2,'CLOSE FRIEND');
INSERT INTO RELATION_TABLE(USER_ID_1,USER_ID_2,RELATION_TYPE_CODE)
VALUES(4,1,'HIGH SCHOOL FRIEND');
INSERT INTO RELATION_TABLE(USER_ID_1,USER_ID_2,RELATION_TYPE_CODE)
VALUES(5,2,'FAMILY MEMBER');
INSERT INTO RELATION_TABLE(USER_ID_1,USER_ID_2,RELATION_TYPE_CODE)
VALUES(1,6,'COLLEAGUE');
INSERT INTO RELATION_TABLE(USER_ID_1,USER_ID_2,RELATION_TYPE_CODE)
VALUES(3,4,'PARTNER');
INSERT INTO RELATION_TABLE(USER_ID_1,USER_ID_2,RELATION_TYPE_CODE)
VALUES(3,6,'COLLEAGUE');
COMMIT;
Sample records:
USER_ID_1  USER_ID_2  RELATION_TYPE_CODE
1          2          CLOSE FRIEND
4          1          HIGH SCHOOL FRIEND
5          2          FAMILY MEMBER
1          6          COLLEAGUE
3          4          PARTNER
3          6          COLLEAGUE
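According to the sample records, user 1 is related to 4, 4 is related to 3, and 3 is related to 6, so 1 may be related to 4, 3 and 6.
So I need to write a recursive query that inserts all possible relations. I tried CONNECT BY earlier, but there is no direct relationship here such as child-parent: any user id can appear in either the USER_ID_1 column or the USER_ID_2 column. There may also be cycles, and I need to ignore those.
Do you have any suggestions for an approach?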
Thanks
Answer 0 (score: 1)
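Since the relations in your data set have no direction, getting all transitive relations means handling relation chains that start with either USER_ID_1 -> USER_ID_2 or USER_ID_2 -> USER_ID_1.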
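You mentioned that you are on 11g, so recursive subquery factoring may be an option for you, but since it is not available until 11gR2, I will avoid it in this example and use CONNECT BY instead.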
In your example, you want to end up with relation records from user #1 to users #3, #4 and #6 (and probably user #2, via the CLOSE FRIEND relation you included).
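To combine these relations, one might first try a combined query: the relation tree rooted on USER_ID_1 plus the relation tree rooted on USER_ID_2, with NOCYCLE to ignore cycles (but this will not work):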
SELECT
CONNECT_BY_ROOT USER_ID_2 AS STARTING_USER_ID,
USER_ID_1 AS RELATED_USER_ID
FROM RELATION_TABLE
START WITH USER_ID_2 = 1
CONNECT BY NOCYCLE PRIOR USER_ID_1 = USER_ID_2
UNION
SELECT
CONNECT_BY_ROOT USER_ID_1 AS STARTING_USER_ID,
USER_ID_2 AS RELATED_USER_ID
FROM RELATION_TABLE
START WITH USER_ID_1 = 1
CONNECT BY NOCYCLE PRIOR USER_ID_2 = USER_ID_1
ORDER BY 1 ASC, 2 ASC;
Result:
STARTING_USER_ID RELATED_USER_ID
1 2
1 3
1 4
1 6
This looks close: it contains the three relations you mentioned, plus the 1 -> 2 relation from the USER_ID_1 side of user #1.
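But a closer look at the data shows that a relation is missing. In the records, user #2 is related to user #5 and user #1 is related to user #2, so user #1 should also be related to user #5. I believe this is what you pointed out in your post: there is no direct parent-child relationship (the network is not directional, but the query is).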
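To make the gap concrete, here is a minimal Python sketch (not part of the original answer) that mimics the two one-way traversals on the sample rows. The names EDGES, follow_chain and one_way_union are hypothetical helpers introduced only for illustration, and a visited set stands in for NOCYCLE.

```python
from collections import defaultdict

# Sample rows from the question, as (USER_ID_1, USER_ID_2) pairs.
EDGES = [(1, 2), (4, 1), (5, 2), (1, 6), (3, 4), (3, 6)]


def follow_chain(edges, start, src, dst):
    """Follow rows in one direction only: match column `src`, hop to column `dst`."""
    step = defaultdict(set)
    for row in edges:
        step[row[src]].add(row[dst])
    seen, stack = set(), [start]
    while stack:
        for nxt in step[stack.pop()]:
            if nxt not in seen:  # the visited set plays the role of NOCYCLE
                seen.add(nxt)
                stack.append(nxt)
    return seen


def one_way_union(edges, start):
    """Union of the two one-way traversals, like the UNION of the two CONNECT BY branches."""
    forward = follow_chain(edges, start, 0, 1)   # USER_ID_1 -> USER_ID_2
    backward = follow_chain(edges, start, 1, 0)  # USER_ID_2 -> USER_ID_1
    return sorted((forward | backward) - {start})


print(one_way_union(EDGES, 1))  # [2, 3, 4, 6]
```

User #5 never shows up because reaching it needs a change of direction mid-chain (1 -> 2 forward, then 2 -> 5 backward), which neither branch allows.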
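To achieve this, one (inefficient) approach is to query the combined set of a -> b and b -> a relations, doubling the data set so that the hierarchical query can proceed as if the relations were directed.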
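In the following query, user #1 can now traverse user #2 to connect to user #5. A side effect of this query is that it creates artificial self-relations, which must be removed (the WHERE LEFT_ID <> CONNECT_BY_ROOT RIGHT_ID filter). In the example provided, a UNION ALL branch adds back any true self-relations.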
To save space, I use LISTAGG here to condense the results:
WITH PSEUDO_DIRECTED_RELATION AS (
SELECT
USER_ID_1 AS LEFT_ID,
USER_ID_2 AS RIGHT_ID
FROM RELATION_TABLE
UNION
SELECT
USER_ID_2 AS LEFT_ID,
USER_ID_1 AS RIGHT_ID
FROM RELATION_TABLE)
SELECT STARTING_ID, LISTAGG(RELATED_ID,',') WITHIN GROUP (ORDER BY RELATED_ID ASC) AS RELATED_USERS
FROM (
SELECT
DISTINCT
CONNECT_BY_ROOT RIGHT_ID AS STARTING_ID,
LEFT_ID AS RELATED_ID
FROM PSEUDO_DIRECTED_RELATION
WHERE LEFT_ID <> CONNECT_BY_ROOT RIGHT_ID
START WITH RIGHT_ID = 1
CONNECT BY NOCYCLE PRIOR LEFT_ID = RIGHT_ID
UNION ALL
SELECT
USER_ID_1 AS STARTING_ID,
USER_ID_2 AS RELATED_ID
FROM RELATION_TABLE
WHERE USER_ID_1 = USER_ID_2
AND USER_ID_1 = 1)
GROUP BY STARTING_ID
ORDER BY 1 ASC;
Result:
STARTING_ID RELATED_USERS
1 2,3,4,5,6
Now user #1 is related to user #5 by way of user #2.
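As a cross-check of the result above, the same idea (treat every row as an edge in both directions, then walk outward from the root user) can be sketched in Python. This is only an illustration under the assumption that the rows fit in memory; EDGES and related_users are hypothetical names, not something from the original post.

```python
from collections import defaultdict, deque

# Sample rows from the question, as (USER_ID_1, USER_ID_2) pairs.
EDGES = [(1, 2), (4, 1), (5, 2), (1, 6), (3, 4), (3, 6)]


def related_users(edges, start):
    """Breadth-first walk over the doubled (a -> b and b -> a) edge set."""
    neighbours = defaultdict(set)
    for a, b in edges:
        if a != b:  # self-relations are handled separately, as in the SQL
            neighbours[a].add(b)
            neighbours[b].add(a)
    seen = {start}
    queue = deque([start])
    while queue:
        for nxt in neighbours[queue.popleft()]:
            if nxt not in seen:  # never revisit a user, so cycles are harmless
                seen.add(nxt)
                queue.append(nxt)
    related = seen - {start}  # drop the artificial self-relation
    if (start, start) in edges:  # keep a true self-relation if one was stored
        related.add(start)
    return sorted(related)


print(related_users(EDGES, 1))  # [2, 3, 4, 5, 6]
```

Unlike the CONNECT BY query, this walk visits each user once rather than enumerating every path, which is one reason the SQL approach is described as inefficient.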
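But maybe this just relates everything to everything, so let's add some more data:
INSERT INTO RELATION_TABLE VALUES (7,9,'Siblings');
INSERT INTO RELATION_TABLE VALUES (7,13,'Pen Pals');
INSERT INTO RELATION_TABLE VALUES (22,7,'Colleagues');
Then rerun the query above. user #1 should not be related to user #7 (at all):
STARTING_ID RELATED_USERS
1 2,3,4,5,6
Now, if we relate user #7 to itself:
INSERT INTO RELATION_TABLE VALUES (7,7,'Self');
and retarget the query at user #7 instead of user #1 (changing START WITH, etc.):
STARTING_ID RELATED_USERS
7 7,9,13,22
If you do not want to query for a single root user, you can remove the START WITH clause and the root-user condition (AND USER_ID_1 = 1) from the self-relation branch. The following query then shows all transitively related users for every user: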
WITH PSEUDO_DIRECTED_RELATION AS (
SELECT
USER_ID_1 AS LEFT_ID,
USER_ID_2 AS RIGHT_ID
FROM RELATION_TABLE
UNION
SELECT
USER_ID_2 AS LEFT_ID,
USER_ID_1 AS RIGHT_ID
FROM RELATION_TABLE)
SELECT STARTING_ID, LISTAGG(RELATED_ID,',') WITHIN GROUP (ORDER BY RELATED_ID ASC) AS RELATED_USERS
FROM (
SELECT
DISTINCT
CONNECT_BY_ROOT RIGHT_ID AS STARTING_ID,
LEFT_ID AS RELATED_ID
FROM PSEUDO_DIRECTED_RELATION
WHERE LEFT_ID <> CONNECT_BY_ROOT RIGHT_ID
CONNECT BY NOCYCLE PRIOR LEFT_ID = RIGHT_ID
UNION ALL
SELECT
USER_ID_1 AS STARTING_ID,
USER_ID_2 AS RELATED_ID
FROM RELATION_TABLE
WHERE USER_ID_1 = USER_ID_2)
GROUP BY STARTING_ID
ORDER BY 1 ASC, 2 ASC;
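For checking the per-user output of the query above without a database, the following Python sketch groups users into connected components. Note that it swaps in a different technique (a disjoint-set, or union-find, structure) rather than a hierarchical walk; EDGES and all_related are hypothetical names, and the rows are the sample data plus the four rows added in this answer.

```python
EDGES = [
    (1, 2), (4, 1), (5, 2), (1, 6), (3, 4), (3, 6),  # original sample rows
    (7, 9), (7, 13), (22, 7), (7, 7),                # rows added in this answer
]


def all_related(edges):
    """Map each user to a comma-separated list of transitively related users."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps the trees shallow
            x = parent[x]
        return x

    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b  # merge the two groups

    groups = {}
    for user in list(parent):
        groups.setdefault(find(user), set()).add(user)

    self_related = {a for a, b in edges if a == b}  # true self-relations
    result = {}
    for user in sorted(parent):
        members = groups[find(user)]
        related = {u for u in members if u != user or user in self_related}
        result[user] = ",".join(str(u) for u in sorted(related))
    return result


for user, related in all_related(EDGES).items():
    print(user, related)
```

A user appears in its own list only when a stored row relates it to itself, which matches the UNION ALL branch of the query.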
Answer 1 (score: 1)
A version using WITH (recursive subquery factoring):
WITH m AS
(SELECT USER_ID_1 u1, USER_ID_2 u2 FROM RELATION_TABLE
UNION
SELECT USER_ID_2, USER_ID_1 FROM RELATION_TABLE),
recur (usr, fri) AS
(SELECT u1, u1 FROM m
UNION ALL
SELECT r.usr, u2 FROM recur r, m WHERE r.fri = m.u1)
CYCLE fri SET cycle TO 1 DEFAULT 0
SELECT usr,
listagg(fri, ',') within GROUP (ORDER BY fri) friends
FROM (SELECT DISTINCT usr, fri FROM recur WHERE usr != fri AND cycle = 0)
GROUP BY usr;
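To experiment with the recursive WITH idea without an Oracle instance, here is a rough analogue run through Python's built-in sqlite3 module. It is a sketch under stated assumptions, not a translation of the Oracle query: SQLite has no CYCLE clause, so the recursive step uses UNION instead of UNION ALL to discard rows it has already produced, and the aggregation is done in Python rather than with LISTAGG. ROWS, QUERY and friends_by_user are hypothetical names.

```python
import sqlite3

# Sample rows from the question.
ROWS = [
    (1, 2, "CLOSE FRIEND"), (4, 1, "HIGH SCHOOL FRIEND"), (5, 2, "FAMILY MEMBER"),
    (1, 6, "COLLEAGUE"), (3, 4, "PARTNER"), (3, 6, "COLLEAGUE"),
]

QUERY = """
WITH RECURSIVE
m(u1, u2) AS (               -- every relation in both directions
    SELECT USER_ID_1, USER_ID_2 FROM RELATION_TABLE
    UNION
    SELECT USER_ID_2, USER_ID_1 FROM RELATION_TABLE
),
recur(usr, fri) AS (
    SELECT u1, u1 FROM m
    UNION                    -- UNION drops rows already seen, so cycles terminate
    SELECT r.usr, m.u2 FROM recur AS r JOIN m ON r.fri = m.u1
)
SELECT usr, fri FROM recur WHERE usr <> fri ORDER BY usr, fri
"""


def friends_by_user(rows):
    """Return {user: 'id,id,...'} of transitively related users."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE RELATION_TABLE "
        "(USER_ID_1 INTEGER, USER_ID_2 INTEGER, RELATION_TYPE_CODE TEXT)"
    )
    conn.executemany("INSERT INTO RELATION_TABLE VALUES (?, ?, ?)", rows)
    friends = {}
    for usr, fri in conn.execute(QUERY):  # rows arrive ordered by usr, fri
        friends.setdefault(usr, []).append(fri)
    conn.close()
    return {usr: ",".join(map(str, fris)) for usr, fris in friends.items()}


print(friends_by_user(ROWS)[1])  # 2,3,4,5,6
```

Like the Oracle version, the filter usr <> fri removes the artificial self-relation that seeds each user's chain.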