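SQL Server: How to limit CTE recursion to rows just recursively added?

Time: 2009-03-11 15:08:34

Tags: sql-server common-table-expression

Simpler example

Let's try a simpler example, so that people can wrap their heads around the concepts and have a practical example to copy and paste into SQL Query Analyzer:

Imagine a Nodes table that holds a hierarchy: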

A
 - B
    - C

We can start testing in Query Analyzer:

CREATE TABLE ##Nodes
(
 NodeID varchar(50) PRIMARY KEY NOT NULL,
 ParentNodeID varchar(50) NULL
)

INSERT INTO ##Nodes (NodeID, ParentNodeID) VALUES ('A', null)
INSERT INTO ##Nodes (NodeID, ParentNodeID) VALUES ('B', 'A')
INSERT INTO ##Nodes (NodeID, ParentNodeID) VALUES ('C', 'B')

Desired output:

ParentNodeID    NodeID    GenerationsRemoved
============    ======    ==================
NULL            A         1
NULL            B         2
NULL            C         3
A               B         1
A               C         2
B               C         1

Now the suggested CTE expression, which gives the wrong output:

WITH NodeChildren AS
(
   --initialization
   SELECT ParentNodeID, NodeID, 1 AS GenerationsRemoved
   FROM ##Nodes
   WHERE ParentNodeID IS NULL

   UNION ALL

   --recursive execution
   SELECT P.ParentNodeID, N.NodeID, P.GenerationsRemoved + 1
   FROM NodeChildren AS P
      INNER JOIN ##Nodes AS N
      ON P.NodeID = N.ParentNodeID
)
SELECT ParentNodeID, NodeID, GenerationsRemoved
FROM NodeChildren

Actual output:

ParentNodeID    NodeID    GenerationsRemoved
============    ======    ==================
NULL            A         1
NULL            B         2
NULL            C         3

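Note: If a SQL Server 2005† CTE cannot do what I was doing before‡ in 2000, that's fine, and that's the answer. Whoever gives "it's not possible" as the answer will win the bounty. But I will wait a few days to make sure everyone agrees that it's not possible before I irrecoverably give away 250 reputation for a non-solution to my problem.

Nitpickers Corner

† Not 2008.

‡ Without resorting to a UDF*, which is the solution I already have.

* Unless you can see a way to improve the performance of the UDF in the original question.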


Original question

I have a table of Nodes, each with a parent that points to another Node (or to null).

For illustration:

1 My Computer
    2 Drive C
         4 Users
         5 Program Files
         7 Windows
             8 System32
    3 Drive D
         6 mp3

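I want a table that returns all the parent-child relationships, together with the number of generations between them.

For all direct parent relationships: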

ParentNodeID  ChildNodeID  GenerationsRemoved
============  ===========  ===================
(null)        1            1
1             2            1
2             4            1
2             5            1
2             7            1
1             3            1
3             6            1
7             8            1

Then there are the grandparent relationships:

ParentNodeID  ChildNodeID  GenerationsRemoved
============  ===========  ===================
(null)        2            2
(null)        3            2
1             4            2
1             5            2
1             7            2
1             6            2
2             8            2

And then the great-grandparent relationships:

ParentNodeID  ChildNodeID  GenerationsRemoved
============  ===========  ===================
(null)        4            3
(null)        5            3
(null)        7            3
(null)        6            3
1             8            3

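So I can figure out the basic CTE initialization:

WITH NodeChildren AS
(
   --initialization
   SELECT ParentNodeID, NodeID AS ChildNodeID, 1 AS GenerationsRemoved
   FROM Nodes
)
SELECT ParentNodeID, ChildNodeID, GenerationsRemoved
FROM NodeChildren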

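The problem now is the recursive part. The obvious answer, of course, doesn't work:

WITH NodeChildren AS
(
   --initialization
   SELECT ParentNodeID, NodeID AS ChildNodeID, 1 AS GenerationsRemoved
   FROM Nodes

   UNION ALL

   --recursive execution: joins the CTE to itself, which SQL Server rejects
   SELECT parents.ParentNodeID, children.ChildNodeID, parents.GenerationsRemoved + 1
   FROM NodeChildren parents
    INNER JOIN NodeChildren children
    ON parents.ChildNodeID = children.ParentNodeID
)
SELECT ParentNodeID, ChildNodeID, GenerationsRemoved
FROM NodeChildren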

Msg 253, Level 16, State 1, Line 1
Recursive member of a common table expression 'NodeChildren' has multiple recursive references.

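All the information needed to generate the entire recursive list is present in the initial CTE table. But if that's not allowed, I'll try:

WITH NodeChildren AS
(
   --initialization
   SELECT ParentNodeID, NodeID, 1 AS GenerationsRemoved
   FROM Nodes

   UNION ALL

   --recursive execution
   SELECT parents.ParentNodeID, Nodes.NodeID, parents.GenerationsRemoved + 1
   FROM NodeChildren parents
    INNER JOIN Nodes
    ON parents.NodeID = Nodes.ParentNodeID
)
SELECT ParentNodeID, NodeID, GenerationsRemoved
FROM NodeChildren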

But that fails, because it's not only joining on the recursive elements; it keeps recursively adding the same rows over and over:

Msg 530, Level 16, State 1, Line 1
The statement terminated. The maximum recursion 100 has been exhausted before statement completion.

In SQL Server 2000 I simulated a CTE with a user-defined function (UDF):

CREATE FUNCTION [dbo].[fn_NodeChildren] ()
RETURNS @Result TABLE (
    ParentNodeID int NULL,
    ChildNodeID int NULL,
    Generations int NOT NULL) 
AS  
/* This UDF returns all "ParentNode" - "Child Node" combinations
    ...even multiple levels separated */
BEGIN 
    DECLARE @Generations int
    SET @Generations = 1

    --Insert into the Return table all "Self" entries
    INSERT INTO @Result
    SELECT ParentNodeID, NodeID, @Generations
    FROM Nodes
    WHILE @@rowcount > 0 
    BEGIN
        SET @Generations = @Generations + 1
        --Add to the Children table: 
        --  children of all nodes just added 
        -- (i.e. Where @Result.Generation = CurrentGeneration-1)
        INSERT @Result
        SELECT CurrentParents.ParentNodeID, Nodes.NodeID, @Generations
        FROM Nodes
            INNER JOIN @Result CurrentParents
            ON Nodes.ParentNodeID = CurrentParents.ChildNodeID
        WHERE CurrentParents.Generations = @Generations - 1
    END
    RETURN
END

The magic that keeps it from blowing up is the limiting WHERE clause: WHERE CurrentParents.Generations = @Generations - 1

How do I prevent a recursive CTE from recursing forever?
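
One point that bears on this question: in a recursive CTE, the recursive member is evaluated against only the rows produced by the previous step, not against the whole accumulated result. That is the same restriction the UDF applies by hand with its WHERE clause. A minimal sketch makes this visible (the CTE name Numbers is purely illustrative and not part of the question):

```sql
-- Illustrative counter CTE; "Numbers" is a made-up name for this sketch.
WITH Numbers AS
(
    -- anchor: a single starting row
    SELECT 1 AS n

    UNION ALL

    -- recursive member: sees only the row(s) added by the previous step
    SELECT n + 1
    FROM Numbers
    WHERE n < 5
)
SELECT n
FROM Numbers
ORDER BY n;
```

This returns the values 1 through 5, each exactly once. If the recursive member saw every accumulated row, the row n = 1 would regenerate n = 2 on every step and the recursion would never run dry.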

8 Answers:

Answer 0 (score: 19)

Try this:

WITH Nodes AS
(
   --initialization
   SELECT ParentNodeID, NodeID, 1 AS GenerationsRemoved
   FROM ##Nodes

   UNION ALL

   ----recursive execution
   SELECT P.ParentNodeID, N.NodeID, P.GenerationsRemoved + 1
   FROM Nodes AS P
      INNER JOIN ##Nodes AS N
      ON P.NodeID = N.ParentNodeID
   WHERE P.GenerationsRemoved <= 10

)
SELECT ParentNodeID, NodeID, GenerationsRemoved
FROM Nodes
ORDER BY ParentNodeID, NodeID, GenerationsRemoved

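Basically, remove the "only show me absolute parents" filter from the initialization query; that way it generates results starting from every row and descends from there. I also added "WHERE P.GenerationsRemoved <= 10" as an infinite-recursion catch (replace 10 with any number up to 100 to fit your needs). Then add the sort so the output looks like the results you wanted.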
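
As a rough check, the same approach can be applied to the hierarchy from the original question. The sketch below assumes a table variable named @Nodes with integer IDs and a Name column; these names are illustrative and do not come from the question. The depth guard is left out here because this sample data has no cycles.

```sql
-- Hypothetical sample data mirroring the "My Computer" hierarchy in the question.
DECLARE @Nodes TABLE
(
    NodeID int PRIMARY KEY NOT NULL,
    ParentNodeID int NULL,
    Name varchar(50) NOT NULL
);

INSERT INTO @Nodes (NodeID, ParentNodeID, Name) VALUES (1, NULL, 'My Computer');
INSERT INTO @Nodes (NodeID, ParentNodeID, Name) VALUES (2, 1, 'Drive C');
INSERT INTO @Nodes (NodeID, ParentNodeID, Name) VALUES (3, 1, 'Drive D');
INSERT INTO @Nodes (NodeID, ParentNodeID, Name) VALUES (4, 2, 'Users');
INSERT INTO @Nodes (NodeID, ParentNodeID, Name) VALUES (5, 2, 'Program Files');
INSERT INTO @Nodes (NodeID, ParentNodeID, Name) VALUES (6, 3, 'mp3');
INSERT INTO @Nodes (NodeID, ParentNodeID, Name) VALUES (7, 2, 'Windows');
INSERT INTO @Nodes (NodeID, ParentNodeID, Name) VALUES (8, 7, 'System32');

WITH NodeChildren AS
(
    -- anchor: every direct parent-child pair, including the root with a NULL parent
    SELECT ParentNodeID, NodeID, 1 AS GenerationsRemoved
    FROM @Nodes

    UNION ALL

    -- recursive member: P holds only the rows produced by the previous step
    SELECT P.ParentNodeID, N.NodeID, P.GenerationsRemoved + 1
    FROM NodeChildren AS P
        INNER JOIN @Nodes AS N
        ON P.NodeID = N.ParentNodeID
)
SELECT ParentNodeID, NodeID, GenerationsRemoved
FROM NodeChildren
ORDER BY GenerationsRemoved, ParentNodeID, NodeID;
```

On this data the query should return 21 rows: the 20 rows listed in the question's three tables, plus (NULL, 8, 4) for System32, which sits four generations below the NULL root.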

Answer 1 (score: 0)

As an aside: do you have SQL Server 2008? This might be a good fit for the hierarchyid data type.
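
A minimal sketch of what that could look like, assuming SQL Server 2008 or later (which the question rules out) and a hypothetical @Tree table whose Node column stores each node's position as a hierarchyid. The table, column names, and path values are illustrative only:

```sql
-- Requires SQL Server 2008 or later. Table and column names are illustrative.
DECLARE @Tree TABLE
(
    Node hierarchyid PRIMARY KEY NOT NULL,
    NodeID varchar(50) NOT NULL
);

-- A is the top node, B is its child, C is B's child.
INSERT INTO @Tree (Node, NodeID) VALUES (hierarchyid::Parse('/1/'), 'A');
INSERT INTO @Tree (Node, NodeID) VALUES (hierarchyid::Parse('/1/1/'), 'B');
INSERT INTO @Tree (Node, NodeID) VALUES (hierarchyid::Parse('/1/1/1/'), 'C');

-- Pair every node with each of its proper ancestors; no recursion is needed.
SELECT a.NodeID AS ParentNodeID,
       d.NodeID AS NodeID,
       d.Node.GetLevel() - a.Node.GetLevel() AS GenerationsRemoved
FROM @Tree AS a
    INNER JOIN @Tree AS d
    ON d.Node.IsDescendantOf(a.Node) = 1
   AND d.Node <> a.Node   -- IsDescendantOf treats a node as its own descendant
ORDER BY a.NodeID, d.NodeID;
```

This yields the (A, B, 1), (A, C, 2) and (B, C, 1) rows. The NULL-parent rows from the desired output are not produced by this sketch and would have to be added separately.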

Answer 2 (score: 0)

If I understand your intentions, you can get your result by doing something like this:

DECLARE @StartID INT;
SET @StartID = 1;
WITH CTE (ChildNodeID, ParentNodeID, [Level]) AS
(
  SELECT  t1.ChildNodeID, 
          t1.ParentNodeID, 
          0
  FROM tblNodes AS t1
  WHERE ChildNodeID = @StartID
  UNION ALL
  SELECT  t1.ChildNodeID, 
          t1.ParentNodeID, 
          t2.[Level]+1
  FROM tblNodes AS t1
    INNER JOIN CTE AS t2 ON t1.ParentNodeID = t2.ChildNodeID    
)
SELECT t1.ChildNodeID, t2.ChildNodeID, t1.[Level]- t2.[Level] AS GenerationsDiff
FROM CTE AS t1
  CROSS APPLY CTE t2

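This returns the generation difference between all nodes; you can modify it to fit your exact needs.

Answer 3 (score: 0)

Well, your answer is not quite that obvious :-)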

WITH NodeChildren AS
(
   --initialization
   SELECT ParentNodeID, ChildNodeID, 1 AS GenerationsRemoved
   FROM Nodes

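This part is called the "anchor" part of the recursive CTE - but it should really select only one row, or a select few rows, from your table - this selects everything!

I guess what you're missing here is simply a suitable WHERE clause: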

WITH NodeChildren AS
(
   --initialization
   SELECT ParentNodeID, ChildNodeID, 1 AS GenerationsRemoved
   FROM Nodes
   WHERE ParentNodeID IS NULL   -- the added filter

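However, I am afraid your requirement to have not just the "straight" hierarchy but also the grandparent-child rows might not be that easy to satisfy.... Normally a recursive CTE will only ever show one level and its direct subordinates (and that down the hierarchy, of course) - it doesn't usually skip one, two or even more levels.

Hope this helps a bit.

Marc

Answer 4 (score: 0)

The issue is the SQL Server default recursion limit (100). If you try your example at the top with the anchor restriction removed (and an ORDER BY added):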

WITH NodeChildren AS
(
   --initialization
   SELECT ParentNodeID, NodeID, 1 AS GenerationsRemoved
   FROM Nodes

   UNION ALL

   --recursive execution
   SELECT P.ParentNodeID, N.NodeID, P.GenerationsRemoved + 1
   FROM NodeChildren AS P
      inner JOIN Nodes AS N
      ON P.NodeID = N.ParentNodeID
)
SELECT ParentNodeID, NodeID, GenerationsRemoved
FROM NodeChildren
ORDER BY ParentNodeID ASC

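This produces the desired results. The problem you're facing is that with a larger number of rows you will recurse more than 100 times, which is the default limit. This can be changed by adding OPTION (MAXRECURSION x) after your query, where x is a number between 1 and 32767. x can also be set to 0, which sets no limit, but that could very quickly have a very detrimental impact on your server's performance. Clearly, as the number of rows in Nodes increases, the number of recursions can rise very quickly, so I would avoid this approach unless there is a known upper limit on the rows in the table. For completeness, the final query should look like this: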

 WITH NodeChildren AS
    (
       --initialization
       SELECT ParentNodeID, NodeID, 1 AS GenerationsRemoved
       FROM Nodes

       UNION ALL

       --recursive execution
       SELECT P.ParentNodeID, N.NodeID, P.GenerationsRemoved + 1
       FROM NodeChildren AS P
          inner JOIN Nodes AS N
          ON P.NodeID = N.ParentNodeID
    )
    SELECT * 
    FROM NodeChildren
    ORDER BY ParentNodeID
    OPTION (MAXRECURSION 32767)

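The 32767 can be adjusted down to suit your scenario.

Answer 5 (score: 0)

Have you tried constructing a path in the CTE and using that to identify ancestors?

You can then subtract the ancestor node's depth from the descendant node's depth to calculate the GenerationsRemoved column, like so...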

DECLARE @Nodes TABLE
(
    NodeId varchar(50) PRIMARY KEY NOT NULL,
    ParentNodeId varchar(50) NULL
)

INSERT INTO @Nodes (NodeId, ParentNodeId) VALUES ('A', NULL)
INSERT INTO @Nodes (NodeId, ParentNodeId) VALUES ('B', 'A')
INSERT INTO @Nodes (NodeId, ParentNodeId) VALUES ('C', 'B')

DECLARE @Hierarchy TABLE
(
    NodeId varchar(50) PRIMARY KEY NOT NULL,
    ParentNodeId varchar(50) NULL,
    Depth int NOT NULL,
    [Path] varchar(2000) NOT NULL
);   -- a statement that precedes WITH must be terminated with a semicolon

WITH Hierarchy AS
(
    --initialization
    SELECT NodeId, ParentNodeId, 0 AS Depth, CONVERT(varchar(2000), NodeId) AS [Path]
    FROM @Nodes
    WHERE ParentNodeId IS NULL

    UNION ALL

    --recursive execution
    SELECT n.NodeId, n.ParentNodeId, p.Depth + 1, CONVERT(varchar(2000), p.[Path] + '/' + n.NodeId)
    FROM Hierarchy AS p
    INNER JOIN @Nodes AS n
    ON p.NodeId = n.ParentNodeId
)
INSERT INTO @Hierarchy
SELECT *
FROM Hierarchy

SELECT parent.NodeId AS AncestorNodeId, child.NodeId AS DescendantNodeId, child.Depth - parent.Depth AS GenerationsRemoved
FROM @Hierarchy AS parent
INNER JOIN @Hierarchy AS child
ON child.[Path] LIKE parent.[Path] + '/%'

Answer 6 (score: 0)

This removes the recursion limit imposed in Chris Shaffer's answer.

I created a table with a cycle:

CREATE TABLE ##Nodes
(
   NodeID varchar(50) PRIMARY KEY NOT NULL,
   ParentNodeID varchar(50) NULL
)

INSERT INTO ##Nodes (NodeID, ParentNodeID) VALUES ('A', 'C');
INSERT INTO ##Nodes (NodeID, ParentNodeID) VALUES ('B', 'A');
INSERT INTO ##Nodes (NodeID, ParentNodeID) VALUES ('C', 'B');

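Where there is a potential cycle (i.e. ParentNodeID IS NOT NULL), the generations removed start at 2. We can then identify a cycle by checking whether P.ParentNodeID = N.NodeID, in which case we simply don't add the row. Afterwards, we append the omitted rows with GenerationsRemoved = 1.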

WITH ParentNodes AS
(
   --initialization
   SELECT ParentNodeID, NodeID, 1 AS GenerationsRemoved
   FROM ##Nodes
   WHERE ParentNodeID IS NULL

   UNION ALL

   SELECT P.ParentNodeID, N.NodeID, 2 AS GenerationsRemoved
   FROM ##Nodes N
   JOIN ##Nodes P ON N.ParentNodeID=P.NodeID
   WHERE P.ParentNodeID IS NOT NULL

   UNION ALL

   ----recursive execution
   SELECT P.ParentNodeID, N.NodeID, P.GenerationsRemoved + 1
   FROM ParentNodes AS P
     INNER JOIN ##Nodes AS N
     ON P.NodeID = N.ParentNodeID
   WHERE P.ParentNodeID IS NULL OR P.ParentNodeID <> N.NodeID

),
Nodes AS (
   SELECT ParentNodeID, NodeID, 1 AS GenerationsRemoved 
   FROM ##Nodes 
   WHERE ParentNodeID IS NOT NULL

   UNION ALL

   SELECT ParentNodeID, NodeID, GenerationsRemoved FROM ParentNodes
)
SELECT ParentNodeID, NodeID, GenerationsRemoved
FROM Nodes
ORDER BY ParentNodeID, NodeID, GenerationsRemoved
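
A different cycle guard, not the one used in this answer, is to carry a delimited list of the nodes already visited and refuse to step onto a node that is already in it. The sketch below assumes the cyclic ##Nodes table created above and node IDs that contain neither '/' nor LIKE wildcard characters; the names Walk and Visited are illustrative:

```sql
-- Assumes the cyclic ##Nodes table created above (A -> B -> C -> A).
WITH Walk AS
(
    -- anchor: every direct pair, plus a delimited list of the nodes seen so far
    SELECT ParentNodeID, NodeID, 1 AS GenerationsRemoved,
           CAST('/' + ISNULL(ParentNodeID, '') + '/' + NodeID + '/' AS varchar(2000)) AS Visited
    FROM ##Nodes

    UNION ALL

    -- recursive member: step to a child only if it is not already on the path
    SELECT P.ParentNodeID, N.NodeID, P.GenerationsRemoved + 1,
           CAST(P.Visited + N.NodeID + '/' AS varchar(2000))
    FROM Walk AS P
        INNER JOIN ##Nodes AS N
        ON P.NodeID = N.ParentNodeID
    WHERE P.Visited NOT LIKE '%/' + N.NodeID + '/%'
)
SELECT ParentNodeID, NodeID, GenerationsRemoved
FROM Walk
ORDER BY ParentNodeID, NodeID, GenerationsRemoved;
```

On the three-row cycle this should stop after one recursive step and return six rows, three at one generation removed and three at two. The trade-off is the extra string column, which grows with the depth of the hierarchy.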

Answer 7 (score: 0)

with cte as
(
    select a=65, L=1
    union all
    select a+1, L=L+1
    from cte
    where L<=100
)
select 
IsRecursion=Case When L>1 then 'Recursion' else 'Not Recursion' end,
AsciiValue=a,
AsciiCharacter=char(a)
from cte
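  1. Create a column that holds the current level.
  2. Check whether the level is > 1.
  3. My example shows a recursive CTE that stops recursing after 100 levels (the maximum). As a bonus, it displays a bunch of ASCII characters and their corresponding numeric values.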