在CTE中找到无限递归循环

时间:2015-07-31 06:07:05

标签: sql recursive-cte

我不是SQL专家。如果有人可以帮助我通过。

我已经递归CTE来获取如下的值。

Child1 - >家长1

Parent1 - >家长2

Parent2 - > NULL

如果数据填充出错了,那么我将会有类似下面的内容,因为CTE可能会进入无限递归循环。给出最大递归误差。由于数据量巨大,我无法手动检查此错误数据。如果有办法找到它,请告诉我。

Child1 - >家长1

Parent1 - > Child1

Child1 - >家长1

Parent1 - > Parent2

Parent2 - > Child1

6 个答案:

答案 0 :(得分:12)

使用Postgres,通过收集阵列中所有访问过的节点,可以很容易地防止这种情况发生。

设置:

create table hierarchy (id integer, parent_id integer);

insert into hierarchy
values
(1, null), -- root element
(2, 1), -- first child
(3, 1), -- second child
(4, 3), 
(5, 4), 
(3, 5); -- endless loop

递归查询:

with recursive tree as (
  select id, 
         parent_id, 
         array[id] as all_parents
  from hierarchy
  where parent_id is null

  union all

  select c.id, 
         c.parent_id,
         p.all_parents||c.id
  from hierarchy c
     join tree p
      on c.parent_id = p.id 
     and c.id <> ALL (p.all_parents) -- this is the trick to exclude the endless loops
)
select *
from tree;

答案 1 :(得分:3)

您还没有指定方言或列名,所以很难做出完美的例子......

-- Some random data
IF OBJECT_ID('tempdb..#MyTable') IS NOT NULL
    DROP TABLE #MyTable

CREATE TABLE #MyTable (ID INT PRIMARY KEY, ParentID INT NULL, Description VARCHAR(100))
INSERT INTO #MyTable (ID, ParentID, Description) VALUES
(1, NULL, 'Parent'), -- Try changing the second value (NULL) to 1 or 2 or 3
(2, 1, 'Child'), -- Try changing the second value (1) to 2 
(3, 2, 'SubChild')
-- End random data

;WITH RecursiveCTE (StartingID, Level, Parents, Loop, ID, ParentID, Description) AS
(
    SELECT ID, 1, '|' + CAST(ID AS VARCHAR(MAX)) + '|', 0, * FROM #MyTable
    UNION ALL
    SELECT R.StartingID, R.Level + 1, 
        R.Parents + CAST(MT.ID AS VARCHAR(MAX)) + '|',
        CASE WHEN R.Parents LIKE '%|' + CAST(MT.ID AS VARCHAR(MAX)) + '|%' THEN 1 ELSE 0 END,
        MT.*
        FROM #MyTable MT
        INNER JOIN RecursiveCTE R ON R.ParentID = MT.ID AND R.Loop = 0
)

SELECT StartingID, Level, Parents, MAX(Loop) OVER (PARTITION BY StartingID) Loop, ID, ParentID, Description 
    FROM RecursiveCTE 
    ORDER BY StartingID, Level

这样的事情将显示递归cte中是否存在循环。查看专栏Loop。使用数据,没有循环。在注释中有关于如何更改值以引起循环的示例。

最后,递归cte以VARCHAR(MAX)(称为|id1|id2|id3|)的形式创建Parents个ID,然后检查当前ID是否已经在&{ #34;列表&#34 ;.如果是,则将Loop列设置为1.在递归连接(ABD R.Loop = 0)中检查此列。

结束查询使用MAX() OVER (PARTITION BY ...)Loop列设置为整个&#34;块&#34;链子。

稍微复杂一点,会产生更好的&#34;报告:

-- Some random data
IF OBJECT_ID('tempdb..#MyTable') IS NOT NULL
    DROP TABLE #MyTable

CREATE TABLE #MyTable (ID INT PRIMARY KEY, ParentID INT NULL, Description VARCHAR(100))
INSERT INTO #MyTable (ID, ParentID, Description) VALUES
(1, NULL, 'Parent'), -- Try changing the second value (NULL) to 1 or 2 or 3
(2, 1, 'Child'), -- Try changing the second value (1) to 2 
(3, 3, 'SubChild')
-- End random data

-- The "terminal" childrens (that are elements that don't have childrens
-- connected to them)
;WITH WithoutChildren AS
(
    SELECT MT1.* FROM #MyTable MT1
        WHERE NOT EXISTS (SELECT 1 FROM #MyTable MT2 WHERE MT1.ID != MT2.ID AND MT1.ID = MT2.ParentID)
)

, RecursiveCTE (StartingID, Level, Parents, Descriptions, Loop, ParentID) AS
(
    SELECT ID, -- StartingID 
        1, -- Level
        '|' + CAST(ID AS VARCHAR(MAX)) + '|', 
        '|' + CAST(Description AS VARCHAR(MAX)) + '|', 
        0, -- Loop
        ParentID
        FROM WithoutChildren
    UNION ALL
    SELECT R.StartingID, -- StartingID
        R.Level + 1, -- Level
        R.Parents + CAST(MT.ID AS VARCHAR(MAX)) + '|',
        R.Descriptions + CAST(MT.Description AS VARCHAR(MAX)) + '|', 
        CASE WHEN R.Parents LIKE '%|' + CAST(MT.ID AS VARCHAR(MAX)) + '|%' THEN 1 ELSE 0 END,
        MT.ParentID
        FROM #MyTable MT
        INNER JOIN RecursiveCTE R ON R.ParentID = MT.ID AND R.Loop = 0
)

SELECT * FROM RecursiveCTE 
    WHERE ParentID IS NULL OR Loop = 1

此查询应返回所有&#34;最后一个孩子&#34;行,具有完整的父链。如果没有循环,则Loop列为0,如果存在循环,则1

答案 2 :(得分:3)

您可以使用Knuth描述的相同方法来检测链接列表中的循环。在一栏中,跟踪孩子,孩子的孩子,孩子的孩子等。在另一栏中,跟踪孙子孙女的孙子孙女。 ,孙子孙女的孙子等等。

对于初始选择,ChildGrandchild列之间的距离为1. union all的每个选择都会将Child的深度增加1,将Grandchild的深度增加{ {1}} by 2.它们之间的距离增加1。

如果你有任何循环,因为距离每次只增加1,在循环Child之后的某个时刻,距离将是循环长度的倍数。发生这种情况时,ChildGrandchild列是相同的。使用它作为附加条件来停止递归,并在其余代码中将其检测为错误。

SQL Server示例:

declare @LinkTable table (Parent int, Child int);
insert into @LinkTable values (1, 2), (1, 3), (2, 4), (2, 5), (3, 6), (3, 7), (7, 1);

with cte as (
    select lt1.Parent, lt1.Child, lt2.Child as Grandchild
    from @LinkTable lt1
    inner join @LinkTable lt2 on lt2.Parent = lt1.Child
    union all
    select cte.Parent, lt1.Child, lt3.Child as Grandchild
    from cte
    inner join @LinkTable lt1 on lt1.Parent = cte.Child
    inner join @LinkTable lt2 on lt2.Parent = cte.Grandchild
    inner join @LinkTable lt3 on lt3.Parent = lt2.Child
    where cte.Child <> cte.Grandchild
)
select Parent, Child
from cte
where Child = Grandchild;

删除导致循环的LinkTable条记录之一,您会发现select不再返回任何数据。

答案 3 :(得分:1)

这是SQL Server的解决方案:

表格插入脚本:

CREATE TABLE MyTable
(
    [ID] INT,
    [ParentID] INT,
    [Name] NVARCHAR(255)
);

INSERT INTO MyTable
(
    [ID],
    [ParentID],
    [Name]
)
VALUES
(1, NULL, 'A root'),
(2, NULL, 'Another root'),
(3, 1, 'Child of 1'),
(4, 3, 'Grandchild of 1'),
(5, 4, 'Great grandchild of 1'),
(6, 1, 'Child of 1'),
(7, 8, 'Child of 8'),
(8, 7, 'Child of 7'), -- This will cause infinite recursion
(9, 1, 'Child of 1');

查找罪魁祸首的确切记录的脚本:

;WITH RecursiveCTE
AS (
   -- Get all parents: 
   -- Any record in MyTable table could be an Parent
   -- We don't know here yet which record can involve in an infinite recursion.
   SELECT ParentID AS StartID,
          ID,
          CAST(Name AS NVARCHAR(255)) AS [ParentChildRelationPath]
   FROM MyTable
   UNION ALL

   -- Recursively try finding all the childrens of above parents
   -- Keep on finding it until this child become parent of above parent.
   -- This will bring us back in the circle to parent record which is being
   -- keep in the StartID column in recursion
   SELECT RecursiveCTE.StartID,
          t.ID,
          CAST(RecursiveCTE.[ParentChildRelationPath] + ' -> ' + t.Name AS NVARCHAR(255)) AS [ParentChildRelationPath]
   FROM RecursiveCTE
       INNER JOIN MyTable AS t
           ON t.ParentID = RecursiveCTE.ID
   WHERE RecursiveCTE.StartID != RecursiveCTE.ID)

-- FInd the ones which causes the infinite recursion
SELECT StartID,
       [ParentChildRelationPath],
       RecursiveCTE.ID
FROM RecursiveCTE
WHERE StartID = ID
OPTION (MAXRECURSION 0);

以上查询的输出:

enter image description here

答案 4 :(得分:0)

尝试限制递归结果

WITH EMP_CTE AS
( 

    SELECT 
        0 AS [LEVEL],   
        ManagerId, EmployeeId, Name
    FROM Employees
    WHERE ManagerId IS NULL

    UNION ALL

    SELECT 
        [LEVEL] + 1 AS [LEVEL],
        ManagerId, EmployeeId, Name
    FROM Employees e
    INNER JOIN EMP_CTE c ON e.ManagerId = c.EmployeeId 
 AND s.LEVEL < 100 --RECURSION LIMIT
) 

    SELECT  * FROM EMP_CTE WHERE [Level] = 100

答案 5 :(得分:0)

这是一种用于检测邻接表(父/子关系)中的循环的替代方法,在该列表中,节点只能有一个父节点,并且可以在子列(下表中的id)上施加唯一约束来强制执行。这是通过递归查询为邻接表计算关闭表来实现的。首先,将每个节点作为自己的祖先添加到关闭表中,级别为0,然后迭代遍历邻接表以扩展关闭表。当新记录的子代和祖代在原始级别零(0)以外的任何级别相同时,将检测到周期:

-- For PostgreSQL and MySQL 8 use the Recursive key word in the CTE code:
-- with RECURSIVE cte(ancestor, child, lev, cycle) as (

with cte(ancestor, child, lev, cycle) as (
  select id, id, 0, 0 from Table1
  union all
  select cte.ancestor
       , Table1.id
       , case when cte.ancestor = Table1.id then 0 else cte.lev + 1 end
       , case when cte.ancestor = Table1.id then cte.lev + 1 else 0 end
    from Table1
    join cte
      on cte.child = Table1.PARENT_ID
   where cte.cycle = 0
) -- In oracle uncomment the next line
-- cycle child set isCycle to 'Y' default 'N'
select distinct
       ancestor
     , child
     , lev
     , max(cycle) over (partition by ancestor) cycle
  from cte

给出表1的以下邻接表:

| parent_id | id |
|-----------|----|
|    (null) |  1 |
|    (null) |  2 |
|         1 |  3 |
|         3 |  4 |
|         1 |  5 |
|         2 |  6 |
|         6 |  7 |
|         7 |  8 |
|         9 | 10 |
|        10 | 11 |
|        11 |  9 |

以上在SQL Sever(以及按指示进行修改的Oracle,PostgreSQL和MySQL 8)上运行的查询正确地检测到节点9、10和11参与了长度为3的循环。

在各种数据库中证明这一点的SQL(/ DB)小提琴可以在下面找到: