Recursive CTE stop condition for loops

Asked: 2014-02-12 02:20:58

Tags: sql postgresql graph common-table-expression recursive-query

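I need to traverse a graph that contains loops using a recursive CTE.

The problem is the loop part.

If there are loops, I want the shortest path to be selected. This basically means ignoring the loops, since the recursion is breadth-first.

The example below shows the returned data:

The problem is the commented-out INSERT statement, which creates a loop. Obviously, with it uncommented, the query never finishes.

What I need is to get back the same data as would be returned without the loop.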

DROP TABLE IF EXISTS edges;

CREATE TABLE edges(
  src integer,
  dst integer,
  data integer
);

INSERT INTO edges VALUES (1, 2, 1);
INSERT INTO edges VALUES (2, 3, 1);
--INSERT INTO edges VALUES (3, 2, 1);  -- This entry creates a loop
INSERT INTO edges VALUES (1, 4, 1);
INSERT INTO edges VALUES (4, 5, 1);
INSERT INTO edges VALUES (5, 2, 1);

INSERT INTO edges VALUES (1, 4, 2);
INSERT INTO edges VALUES (4, 5, 2);
INSERT INTO edges VALUES (4, 6, 2);


WITH RECURSIVE paths AS (
  -- For simplicity assume node 1 is the start
  -- we'll have two starting nodes for data = 1 and 2
  SELECT DISTINCT
    src           as node
    , data        as data
    , 0           as depth
    , src::text   as path
  FROM edges
  WHERE
    src = 1

  UNION ALL

  SELECT DISTINCT
    edges.dst
    , edges.data
    , depth + 1
    , paths.path || '->' || edges.dst::text
  FROM paths
    JOIN edges ON edges.src = paths.node AND edges.data = paths.data
    -- AND eliminate loops?
)

SELECT * FROM paths;

Returns:

 node | data | depth |     path      
------+------+-------+---------------
    1 |    1 |     0 | 1
    1 |    2 |     0 | 1
    2 |    1 |     1 | 1->2
    4 |    1 |     1 | 1->4
    4 |    2 |     1 | 1->4
    3 |    1 |     2 | 1->2->3
    5 |    2 |     2 | 1->4->5
    6 |    2 |     2 | 1->4->6
    5 |    1 |     2 | 1->4->5
    2 |    1 |     3 | 1->4->5->2
    3 |    1 |     4 | 1->4->5->2->3
(11 rows)

2 answers:

Answer 0 (score: 1)

WITH RECURSIVE paths AS (
    -- For simplicity assume node 1 is the start
    -- we'll have two starting nodes for data = 1 and 2
    SELECT DISTINCT
        src           as node
        , data        as data
        , 0           as depth
        , src::text   as path
        , ''          as edgeAdded   
    FROM edges
    WHERE
        src = 1

    UNION ALL

    SELECT DISTINCT
        edges.dst
        , edges.data
        , depth + 1
        , paths.path || '->' || edges.dst::text
        , edges.src::text || '->' || edges.dst::text
    FROM paths
    JOIN edges ON edges.src = paths.node AND edges.data = paths.data
    -- Eliminate loops: skip edges whose destination already appears in the path
    AND NOT paths.path LIKE '%' || edges.dst::text || '%'
)
SELECT * FROM paths;

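The condition AND NOT paths.path LIKE '%' || edges.dst::text || '%' skips the back edges that would create a loop. Because LIKE does a substring match on the path text, it also rejects a node whose id merely appears inside another id in the path (for example 2 inside 12), so it is only safe when that cannot happen. Demo: http://www.sqlfiddle.com/#!12/086ee/1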
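The path check above can be tried without a PostgreSQL server. The following is a minimal sketch, assuming Python's standard-library sqlite3 module and an in-memory database; the function name find_paths is hypothetical, and the SQL uses SQLite spellings (CAST instead of ::text, instr instead of LIKE). It is a variant of the answer's condition, not the answer's exact query: both the path and the candidate node are wrapped in '->' delimiters so that only whole node ids match. The loop edge (3, 2, 1) is inserted, and the query still terminates.

```python
import sqlite3

# Edge list from the question, with the loop edge (3, 2, 1) included.
EDGES = [
    (1, 2, 1), (2, 3, 1), (3, 2, 1),
    (1, 4, 1), (4, 5, 1), (5, 2, 1),
    (1, 4, 2), (4, 5, 2), (4, 6, 2),
]

QUERY = """
WITH RECURSIVE paths(node, data, depth, path) AS (
    SELECT DISTINCT src, data, 0, CAST(src AS TEXT)
    FROM edges
    WHERE src = 1
  UNION ALL
    SELECT e.dst, e.data, p.depth + 1, p.path || '->' || e.dst
    FROM paths AS p
    JOIN edges AS e ON e.src = p.node AND e.data = p.data
    -- Delimiter-wrapped check: reject dst only if it is a whole node in the path.
    WHERE instr('->' || p.path || '->', '->' || e.dst || '->') = 0
)
SELECT node, data, depth, path FROM paths ORDER BY depth, data, node;
"""


def find_paths(edges=EDGES):
    """Return (node, data, depth, path) rows for all loop-free paths from node 1."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE edges (src INTEGER, dst INTEGER, data INTEGER)")
        conn.executemany("INSERT INTO edges VALUES (?, ?, ?)", edges)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in find_paths():
        print(row)
```

With the loop edge present, this yields the same 11 rows as the question's loop-free run, which is the result the question asks for.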

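Answer 1 (score: 0)

The standard way to handle loops is to build an array of visited nodes along the way and check whether the next element already exists in it: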

WITH RECURSIVE paths AS (
  SELECT DISTINCT
    src           as node
    , data        as data
    , 0           as depth
    , src::text   as path
    , false       as is_cycle
    , ARRAY[src]  as path_array
  FROM edges
  WHERE src IN (1,2)
  UNION ALL
  SELECT DISTINCT
    edges.dst
    , edges.data
    , depth + 1
    , paths.path || '->' || edges.dst::text
    , dst = ANY(path_array)
    , path_array  || dst
  FROM paths
  JOIN edges 
    ON edges.src = paths.node 
    AND edges.data = paths.data
    AND NOT is_cycle
)
SELECT * FROM paths;

Output:

+-------+-------+--------+-------------------+-----------+---------------+
| node  | data  | depth  |       path        | is_cycle  |  path_array   |
+-------+-------+--------+-------------------+-----------+---------------+
|    1  |    1  |     0  | 1                 | f         | {1}           |
|    1  |    2  |     0  | 1                 | f         | {1}           |
|    2  |    1  |     0  | 2                 | f         | {2}           |
|    2  |    1  |     1  | 1->2              | f         | {1,2}         |
|    3  |    1  |     1  | 2->3              | f         | {2,3}         |
|    4  |    1  |     1  | 1->4              | f         | {1,4}         |
|    4  |    2  |     1  | 1->4              | f         | {1,4}         |
|    2  |    1  |     2  | 2->3->2           | t         | {2,3,2}       |
|    3  |    1  |     2  | 1->2->3           | f         | {1,2,3}       |
|    5  |    1  |     2  | 1->4->5           | f         | {1,4,5}       |
|    5  |    2  |     2  | 1->4->5           | f         | {1,4,5}       |
|    6  |    2  |     2  | 1->4->6           | f         | {1,4,6}       |
|    2  |    1  |     3  | 1->2->3->2        | t         | {1,2,3,2}     |
|    2  |    1  |     3  | 1->4->5->2        | f         | {1,4,5,2}     |
|    3  |    1  |     4  | 1->4->5->2->3     | f         | {1,4,5,2,3}   |
|    2  |    1  |     5  | 1->4->5->2->3->2  | t         | {1,4,5,2,3,2} |
+-------+-------+--------+-------------------+-----------+---------------+
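To make the evaluation order behind that output easier to follow, here is a minimal sketch in plain Python that mimics the query level by level. It assumes the loop edge (3, 2, 1) is present in the table, as the output above implies; the names Row and walk are hypothetical. Each row carries its path, a row is flagged when its new node is already in that path, and flagged rows are not expanded further, which mirrors AND NOT is_cycle.

```python
from typing import NamedTuple


class Row(NamedTuple):
    node: int
    data: int
    depth: int
    path: tuple      # plays the role of path_array
    is_cycle: bool


# Edge list from the question as (src, dst, data), including the loop edge (3, 2, 1).
EDGES = [
    (1, 2, 1), (2, 3, 1), (3, 2, 1),
    (1, 4, 1), (4, 5, 1), (5, 2, 1),
    (1, 4, 2), (4, 5, 2), (4, 6, 2),
]


def walk(edges, starts):
    """Emulate the recursive CTE: one iteration per depth level."""
    # Anchor term: distinct (src, data) pairs for the start nodes.
    frontier = sorted({Row(src, data, 0, (src,), False)
                       for src, _dst, data in edges if src in starts})
    result = []
    while frontier:
        result.extend(frontier)
        next_rows = set()  # a set stands in for SELECT DISTINCT
        for row in frontier:
            if row.is_cycle:
                continue  # AND NOT is_cycle: do not expand flagged rows
            for src, dst, data in edges:
                if src == row.node and data == row.data:
                    next_rows.add(Row(dst, data, row.depth + 1,
                                      row.path + (dst,), dst in row.path))
        frontier = sorted(next_rows)
    return result


if __name__ == "__main__":
    for r in walk(EDGES, {1, 2}):
        print(r)
```

Note that the flagged rows themselves are still returned. Keeping only rows where is_cycle is false, and starting from node 1 alone as in the question, leaves the 11 loop-free rows.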

db<>fiddle demo


PostgreSQL 14 extends recursive CTEs with two new clauses, SEARCH and CYCLE:

Cycle Detection

CYCLE id SET is_cycle TO true DEFAULT false USING path

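The CYCLE clause first specifies the list of columns to track for cycle detection, then the name of a column that will show whether a cycle has been detected, then the two values to use in that column for the yes and no cases, and finally the name of another column that will track the path. The cycle and path columns are implicitly added to the output rows of the CTE.

Here is a demo using the analogous syntax in Oracle: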

WITH paths(node, data,depth,path) AS (
  SELECT
    src           as node
    , data        as data
    , 0           as depth
    , TO_CHAR(src)as path
  FROM edges
  WHERE src IN (1,2)
  UNION ALL
  SELECT
    edges.dst
    , edges.data
    , depth + 1
    , paths.path || '->' || edges.dst
  FROM paths
  JOIN edges 
    ON edges.src = paths.node 
    AND edges.data = paths.data
) CYCLE node SET cycle TO 1 DEFAULT 0
SELECT DISTINCT * FROM paths;

db<>fiddle demo 2