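I am not a SQL expert, so I would appreciate any help.
I have a recursive CTE that retrieves values like the following.
Child1 -> Parent1
Parent1 -> Parent2
Parent2 -> NULL
If the data was populated incorrectly, I end up with something like the examples below, and the CTE can go into an infinite recursion loop and fail with a maximum recursion error. Because the data volume is huge, I cannot check for this bad data manually. Please let me know if there is a way to find it.
Child1 -> Parent1
Parent1 -> Child1
or
Child1 -> Parent1
Parent1 -> Parent2
Parent2 -> Child1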
Answer 0 (score: 12)
With Postgres it is quite easy to prevent this by collecting all visited nodes in an array.
Setup:
create table hierarchy (id integer, parent_id integer);
insert into hierarchy
values
(1, null), -- root element
(2, 1), -- first child
(3, 1), -- second child
(4, 3),
(5, 4),
(3, 5); -- endless loop
Recursive query:
with recursive tree as (
select id,
parent_id,
array[id] as all_parents
from hierarchy
where parent_id is null
union all
select c.id,
c.parent_id,
p.all_parents||c.id
from hierarchy c
join tree p
on c.parent_id = p.id
and c.id <> ALL (p.all_parents) -- this is the trick to exclude the endless loops
)
select *
from tree;
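The query above stops the recursion, but it does not tell you which row is the bad one. As a rough illustration of the same idea outside the database, here is a minimal Python sketch. The function name find_cycle_edges and the (id, parent_id) tuple layout are my own assumptions, not part of the answer. It walks down from the roots, carries the visited ids along each path the way all_parents does, and reports any row whose id is already on the path. Like the query, it only sees cycles that are reachable from a root.

```python
from collections import defaultdict


def find_cycle_edges(rows):
    """Return the (id, parent_id) rows that would close a loop.

    rows is an iterable of (id, parent_id) tuples; parent_id is None for roots.
    """
    children = defaultdict(list)
    roots = []
    for node_id, parent_id in rows:
        if parent_id is None:
            roots.append(node_id)
        else:
            children[parent_id].append(node_id)

    cycle_edges = []
    # Each stack entry carries the path from the root, like all_parents.
    stack = [(root, (root,)) for root in roots]
    while stack:
        node_id, path = stack.pop()
        for child_id in children[node_id]:
            if child_id in path:
                # Same test as "c.id <> ALL (p.all_parents)", but reported.
                cycle_edges.append((child_id, node_id))
            else:
                stack.append((child_id, path + (child_id,)))
    return cycle_edges


hierarchy = [(1, None), (2, 1), (3, 1), (4, 3), (5, 4), (3, 5)]
print(find_cycle_edges(hierarchy))
```

With the sample data from the setup, the only row reported is the one marked "endless loop".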
Answer 1 (score: 3)
You haven't specified the dialect or your column names, so it is difficult to make the perfect example...
-- Some random data
IF OBJECT_ID('tempdb..#MyTable') IS NOT NULL
DROP TABLE #MyTable
CREATE TABLE #MyTable (ID INT PRIMARY KEY, ParentID INT NULL, Description VARCHAR(100))
INSERT INTO #MyTable (ID, ParentID, Description) VALUES
(1, NULL, 'Parent'), -- Try changing the second value (NULL) to 1 or 2 or 3
(2, 1, 'Child'), -- Try changing the second value (1) to 2
(3, 2, 'SubChild')
-- End random data
;WITH RecursiveCTE (StartingID, Level, Parents, Loop, ID, ParentID, Description) AS
(
SELECT ID, 1, '|' + CAST(ID AS VARCHAR(MAX)) + '|', 0, * FROM #MyTable
UNION ALL
SELECT R.StartingID, R.Level + 1,
R.Parents + CAST(MT.ID AS VARCHAR(MAX)) + '|',
CASE WHEN R.Parents LIKE '%|' + CAST(MT.ID AS VARCHAR(MAX)) + '|%' THEN 1 ELSE 0 END,
MT.*
FROM #MyTable MT
INNER JOIN RecursiveCTE R ON R.ParentID = MT.ID AND R.Loop = 0
)
SELECT StartingID, Level, Parents, MAX(Loop) OVER (PARTITION BY StartingID) Loop, ID, ParentID, Description
FROM RecursiveCTE
ORDER BY StartingID, Level
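Something like this will show whether there are loops in the recursive CTE. Look at the Loop column. With the data as given, there are no loops. The comments contain examples of how to change the values to cause a loop.
In the end, the recursive CTE builds a VARCHAR(MAX) list of IDs in the form |id1|id2|id3| (called Parents) and then checks whether the current ID is already in that "list". If it is, it sets the Loop column to 1. This column is checked in the recursive join (the AND R.Loop = 0 condition), so the recursion stops as soon as a loop is found.
The final query uses MAX() OVER (PARTITION BY ...) to set the Loop column to 1 for the whole "block" of the chain.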
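If you want to try the delimited-path idea without a SQL Server instance, the following is a minimal sketch that runs an equivalent recursive CTE on SQLite through Python's sqlite3 module. The names my_table, is_loop and find_looping_ids are assumptions made for this illustration, and SQLite's || and instr() stand in for + and LIKE.

```python
import sqlite3


def find_looping_ids(rows):
    """Return the ids whose chain of parents runs into a loop.

    rows is an iterable of (id, parent_id) tuples.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE my_table (id INTEGER PRIMARY KEY, parent_id INTEGER)"
    )
    conn.executemany("INSERT INTO my_table VALUES (?, ?)", rows)
    query = """
    WITH RECURSIVE chain (starting_id, parents, is_loop, parent_id) AS (
        -- Anchor: every row starts its own chain with the path |id|
        SELECT id, '|' || id || '|', 0, parent_id FROM my_table
        UNION ALL
        -- Step: move to the parent, append it, flag it if already seen
        SELECT c.starting_id,
               c.parents || t.id || '|',
               CASE WHEN instr(c.parents, '|' || t.id || '|') > 0
                    THEN 1 ELSE 0 END,
               t.parent_id
        FROM my_table AS t
        JOIN chain AS c ON c.parent_id = t.id AND c.is_loop = 0
    )
    SELECT DISTINCT starting_id FROM chain
    WHERE is_loop = 1
    ORDER BY starting_id
    """
    result = [row[0] for row in conn.execute(query)]
    conn.close()
    return result


print(find_looping_ids([(1, None), (2, 1), (3, 2)]))  # no loop
print(find_looping_ids([(1, None), (2, 2), (3, 2)]))  # 2 is its own parent
```

The second call corresponds to the "try changing the second value (1) to 2" hint in the sample data: both row 2 and row 3 end up in a chain that loops.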
A slightly more complex version that generates a "better" report:
-- Some random data
IF OBJECT_ID('tempdb..#MyTable') IS NOT NULL
DROP TABLE #MyTable
CREATE TABLE #MyTable (ID INT PRIMARY KEY, ParentID INT NULL, Description VARCHAR(100))
INSERT INTO #MyTable (ID, ParentID, Description) VALUES
(1, NULL, 'Parent'), -- Try changing the second value (NULL) to 1 or 2 or 3
(2, 1, 'Child'), -- Try changing the second value (1) to 2
(3, 3, 'SubChild')
-- End random data
-- The "terminal" children (elements that have no children
-- connected to them)
;WITH WithoutChildren AS
(
SELECT MT1.* FROM #MyTable MT1
WHERE NOT EXISTS (SELECT 1 FROM #MyTable MT2 WHERE MT1.ID != MT2.ID AND MT1.ID = MT2.ParentID)
)
, RecursiveCTE (StartingID, Level, Parents, Descriptions, Loop, ParentID) AS
(
SELECT ID, -- StartingID
1, -- Level
'|' + CAST(ID AS VARCHAR(MAX)) + '|',
'|' + CAST(Description AS VARCHAR(MAX)) + '|',
0, -- Loop
ParentID
FROM WithoutChildren
UNION ALL
SELECT R.StartingID, -- StartingID
R.Level + 1, -- Level
R.Parents + CAST(MT.ID AS VARCHAR(MAX)) + '|',
R.Descriptions + CAST(MT.Description AS VARCHAR(MAX)) + '|',
CASE WHEN R.Parents LIKE '%|' + CAST(MT.ID AS VARCHAR(MAX)) + '|%' THEN 1 ELSE 0 END,
MT.ParentID
FROM #MyTable MT
INNER JOIN RecursiveCTE R ON R.ParentID = MT.ID AND R.Loop = 0
)
SELECT * FROM RecursiveCTE
WHERE ParentID IS NULL OR Loop = 1
This query should return all the "last child" rows, each with its full parent chain. The Loop column is 0 if there is no loop and 1 if there is one.
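Answer 2 (score: 3)
You can use the same approach described by Knuth for detecting a cycle in a linked list. In one column, keep track of the children, the children's children, the children's children's children, and so on. In another column, keep track of the grandchildren, the grandchildren's grandchildren, the grandchildren's grandchildren's grandchildren, and so on.
For the initial selection, the distance between the Child and Grandchild columns is 1. Every selection from the union all increases the depth of Child by 1 and that of Grandchild by 2, so the distance between them increases by 1.
If you have any loop, since the distance only increases by 1 each time, at some point after Child is in the loop the distance will be a multiple of the cycle length. When that happens, the Child and Grandchild columns are the same. Use that as an additional condition to stop the recursion, and detect it in the rest of your code as an error.
SQL Server example: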
declare @LinkTable table (Parent int, Child int);
insert into @LinkTable values (1, 2), (1, 3), (2, 4), (2, 5), (3, 6), (3, 7), (7, 1);
with cte as (
select lt1.Parent, lt1.Child, lt2.Child as Grandchild
from @LinkTable lt1
inner join @LinkTable lt2 on lt2.Parent = lt1.Child
union all
select cte.Parent, lt1.Child, lt3.Child as Grandchild
from cte
inner join @LinkTable lt1 on lt1.Parent = cte.Child
inner join @LinkTable lt2 on lt2.Parent = cte.Grandchild
inner join @LinkTable lt3 on lt3.Parent = lt2.Child
where cte.Child <> cte.Grandchild
)
select Parent, Child
from cte
where Child = Grandchild;
Remove one of the LinkTable records that causes the cycle, and you will find that the select no longer returns any data.
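For readers who have not seen the linked-list version of this technique (often called the tortoise-and-hare method), here is a minimal Python sketch of it. It assumes each node has at most one parent, stored in a dict I have called parent_of; the function name has_parent_cycle is likewise only illustrative. One pointer moves one step per round and the other moves two, exactly like the Child and Grandchild columns.

```python
def has_parent_cycle(parent_of, start):
    """Return True if following parents from start never reaches a root.

    parent_of maps a node to its parent; roots map to None or are absent.
    """
    slow = fast = start
    while True:
        # The fast pointer advances two steps (the Grandchild column).
        for _ in range(2):
            fast = parent_of.get(fast)
            if fast is None:
                return False  # reached a root, so there is no cycle
        # The slow pointer advances one step (the Child column).
        slow = parent_of.get(slow)
        if slow == fast:
            return True  # the pointers met inside a cycle


good = {"Child1": "Parent1", "Parent1": "Parent2", "Parent2": None}
bad = {"Child1": "Parent1", "Parent1": "Parent2", "Parent2": "Child1"}
print(has_parent_cycle(good, "Child1"))
print(has_parent_cycle(bad, "Child1"))
```

The two dictionaries mirror the good and bad examples from the question. The appeal of this approach is that it needs no list of visited nodes, only two pointers.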
Answer 3 (score: 1)
Here is a solution for SQL Server:
Table insert script:
CREATE TABLE MyTable
(
[ID] INT,
[ParentID] INT,
[Name] NVARCHAR(255)
);
INSERT INTO MyTable
(
[ID],
[ParentID],
[Name]
)
VALUES
(1, NULL, 'A root'),
(2, NULL, 'Another root'),
(3, 1, 'Child of 1'),
(4, 3, 'Grandchild of 1'),
(5, 4, 'Great grandchild of 1'),
(6, 1, 'Child of 1'),
(7, 8, 'Child of 8'),
(8, 7, 'Child of 7'), -- This will cause infinite recursion
(9, 1, 'Child of 1');
Script to find the exact records that are the culprits:
;WITH RecursiveCTE
AS (
-- Get all parents:
-- Any record in MyTable could be a parent.
-- We don't know yet which records are involved in an infinite recursion.
SELECT ParentID AS StartID,
ID,
CAST(Name AS NVARCHAR(255)) AS [ParentChildRelationPath]
FROM MyTable
UNION ALL
-- Recursively find all the children of the parents above.
-- Keep going until a child turns out to be the parent of its own starting parent.
-- That brings us full circle, back to the parent record that is
-- kept in the StartID column throughout the recursion.
SELECT RecursiveCTE.StartID,
t.ID,
CAST(RecursiveCTE.[ParentChildRelationPath] + ' -> ' + t.Name AS NVARCHAR(255)) AS [ParentChildRelationPath]
FROM RecursiveCTE
INNER JOIN MyTable AS t
ON t.ParentID = RecursiveCTE.ID
WHERE RecursiveCTE.StartID != RecursiveCTE.ID)
-- Find the ones that cause the infinite recursion
SELECT StartID,
[ParentChildRelationPath],
RecursiveCTE.ID
FROM RecursiveCTE
WHERE StartID = ID
OPTION (MAXRECURSION 0);
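Output of the above query: for the sample data, it should return two rows, one with StartID 7 and one with StartID 8, which are the two records that reference each other.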
Answer 4 (score: 0)
Try limiting the depth of the recursion:
WITH EMP_CTE AS
(
SELECT
0 AS [LEVEL],
ManagerId, EmployeeId, Name
FROM Employees
WHERE ManagerId IS NULL
UNION ALL
SELECT
c.[LEVEL] + 1 AS [LEVEL],
e.ManagerId, e.EmployeeId, e.Name
FROM Employees e
INNER JOIN EMP_CTE c ON e.ManagerId = c.EmployeeId
AND c.[LEVEL] < 100 -- recursion limit
)
SELECT * FROM EMP_CTE WHERE [LEVEL] = 100 -- rows that reached the limit are suspect
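Answer 5 (score: 0)
Here is an alternative method for detecting cycles in an adjacency list (parent/child relationships) in which a node can have only one parent, something that can be enforced with a unique constraint on the child column (id in the table below). It works by computing the closure table for the adjacency list with a recursive query. It starts by adding every node to the closure table as its own ancestor at level 0, then iteratively traverses the adjacency list to expand the closure table. A cycle is detected when a new record's child and ancestor are the same at any level other than the original level zero (0):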
-- For PostgreSQL and MySQL 8 use the Recursive key word in the CTE code:
-- with RECURSIVE cte(ancestor, child, lev, cycle) as (
with cte(ancestor, child, lev, cycle) as (
select id, id, 0, 0 from Table1
union all
select cte.ancestor
, Table1.id
, case when cte.ancestor = Table1.id then 0 else cte.lev + 1 end
, case when cte.ancestor = Table1.id then cte.lev + 1 else 0 end
from Table1
join cte
on cte.child = Table1.PARENT_ID
where cte.cycle = 0
) -- In oracle uncomment the next line
-- cycle child set isCycle to 'Y' default 'N'
select distinct
ancestor
, child
, lev
, max(cycle) over (partition by ancestor) cycle
from cte
Given the following adjacency list for Table1:
| parent_id | id |
|-----------|----|
| (null) | 1 |
| (null) | 2 |
| 1 | 3 |
| 3 | 4 |
| 1 | 5 |
| 2 | 6 |
| 6 | 7 |
| 7 | 8 |
| 9 | 10 |
| 10 | 11 |
| 11 | 9 |
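The above query, running on SQL Server (and on Oracle, PostgreSQL and MySQL 8 when modified as indicated), correctly detects that nodes 9, 10 and 11 participate in a cycle of length 3.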
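To check the closure-table query locally, here is a minimal sketch that runs it on SQLite through Python's sqlite3 module, using the adjacency list above. SQLite is not one of the databases named in the answer, and the names table1, cycle_len and find_cycles are assumptions made for this illustration. The final select is reduced to just the ancestors that sit on a cycle.

```python
import sqlite3

# (parent_id, id) pairs copied from the adjacency list above.
ADJACENCY = [
    (None, 1), (None, 2), (1, 3), (3, 4), (1, 5), (2, 6),
    (6, 7), (7, 8), (9, 10), (10, 11), (11, 9),
]


def find_cycles(rows):
    """Return (ancestor, cycle_len) for every node that sits on a cycle."""
    conn = sqlite3.connect(":memory:")
    # The UNIQUE constraint enforces "one parent per node".
    conn.execute("CREATE TABLE table1 (parent_id INTEGER, id INTEGER UNIQUE)")
    conn.executemany("INSERT INTO table1 VALUES (?, ?)", rows)
    query = """
    WITH RECURSIVE cte (ancestor, child, lev, cycle_len) AS (
        -- Level 0: every node is its own ancestor
        SELECT id, id, 0, 0 FROM table1
        UNION ALL
        -- Expand the closure; arriving back at the ancestor marks a cycle
        SELECT cte.ancestor,
               table1.id,
               CASE WHEN cte.ancestor = table1.id THEN 0 ELSE cte.lev + 1 END,
               CASE WHEN cte.ancestor = table1.id THEN cte.lev + 1 ELSE 0 END
        FROM table1
        JOIN cte ON cte.child = table1.parent_id
        WHERE cte.cycle_len = 0
    )
    SELECT ancestor, cycle_len FROM cte
    WHERE cycle_len > 0
    ORDER BY ancestor
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


print(find_cycles(ADJACENCY))
```

Each tuple pairs a node with the length of the cycle it belongs to, so the three nodes of the 9, 10, 11 loop each appear once.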