Why does a recursive CTE run analytic functions (ROW_NUMBER) procedurally?

Time: 2012-04-01 14:53:29

Tags: sql sql-server common-table-expression row-number recursive-cte

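Yesterday I answered a question about a recursive CTE, and it exposed an issue with the way these are implemented in SQL Server (and possibly in other RDBMSs too?). Basically, when I try to use ROW_NUMBER against the current recursion level, it runs against each row subset of the current recursion level. I would expect this to work in true SET logic and run against the entire current recursion level.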

It appears, from this MSDN article, that the issue I found is intended functionality:

  

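Analytic and aggregate functions in the recursive part of the CTE are applied to the set for the current recursion level and not to the set for the CTE. Functions like ROW_NUMBER operate only on the subset of data passed to them by the current recursion level and not the entire set of data passed to the recursive part of the CTE. For more information, see J. Using analytical functions in a recursive CTE.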

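In my digging, I could not find anything that explains why it was designed to work this way. It is more of a procedural approach inside a set-based language, so it works against my SQL thought process and is, in my opinion, quite confusing. Does anyone know, or can anyone explain, why a recursive CTE handles analytic functions at each recursion level in a procedural fashion?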


Here is some code to help visualize this:

Notice the RowNumber column in the output of each piece of code.
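
For anyone reproducing this without the fiddles, a table along the following lines should be enough. Only the column names GroupID, ParentID and Score come from the queries in this post; the column types and the sample rows are assumptions made up for illustration.

```sql
-- Hypothetical schema: column names are taken from the queries in this post,
-- while the types and the rows are assumptions for illustration only.
CREATE TABLE tblGroups (
    GroupID  INT PRIMARY KEY,
    ParentID INT NULL,      -- NULL marks a top-level group
    Score    INT NOT NULL
);

-- Two top-level groups, each with two children.
INSERT INTO tblGroups (GroupID, ParentID, Score)
VALUES (1, NULL, 90),
       (2, NULL, 80),
       (3, 1,    70),
       (4, 1,    60),
       (5, 2,    50),
       (6, 2,    40);
```

With rows like these, the second query below (the plain join) numbers the four children 1 through 4, while the behavior described in this question would show up in the recursive CTE as the numbering restarting under each parent.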

Here is the SQLFiddle for the CTE (only showing the 2nd level of the recursion)

WITH myCTE
AS
(
  SELECT *, ROW_NUMBER() OVER (ORDER BY Score desc) AS RowNumber, 1 AS RecurseLevel
  FROM tblGroups
  WHERE ParentId IS NULL

  UNION ALL

  SELECT tblGroups.*, 
      ROW_NUMBER() OVER (ORDER BY myCTE.RowNumber , tblGroups.Score desc) AS RowNumber, 
      RecurseLevel + 1 AS RecurseLevel
  FROM tblGroups
      JOIN myCTE
          ON myCTE.GroupID = tblGroups.ParentID
 )
SELECT *
FROM myCTE
WHERE RecurseLevel = 2;

Here is the second SQLFiddle for what I would expect the CTE to do (again only need the 2nd level to display the issue)

WITH myCTE
AS
(
  SELECT *, ROW_NUMBER() OVER (ORDER BY Score desc) AS RowNumber, 1 AS RecurseLevel
  FROM tblGroups
  WHERE ParentId IS NULL
 )
  SELECT tblGroups.*, 
      ROW_NUMBER() OVER (ORDER BY myCTE.RowNumber , tblGroups.Score desc) AS RowNumber, 
      RecurseLevel + 1 AS RecurseLevel
  FROM tblGroups
      JOIN myCTE
          ON myCTE.GroupID = tblGroups.ParentID;

I have always envisioned a SQL recursive CTE as working more like this while loop:

DECLARE @RecursionLevel INT
SET @RecursionLevel = 0
SELECT *, ROW_NUMBER() OVER (ORDER BY Score desc) AS RowNumber, @RecursionLevel AS recurseLevel
INTO #RecursiveTable
FROM tblGroups
WHERE ParentId IS NULL

WHILE EXISTS( SELECT tblGroups.* FROM tblGroups JOIN #RecursiveTable ON #RecursiveTable.GroupID = tblGroups.ParentID WHERE recurseLevel = @RecursionLevel)
BEGIN

    INSERT INTO #RecursiveTable
    SELECT tblGroups.*, 
        ROW_NUMBER() OVER (ORDER BY #RecursiveTable.RowNumber , tblGroups.Score desc) AS RowNumber, 
        recurseLevel + 1 AS recurseLevel
    FROM tblGroups
        JOIN #RecursiveTable
            ON #RecursiveTable.GroupID = tblGroups.ParentID
    WHERE recurseLevel = @RecursionLevel
    SET @RecursionLevel = @RecursionLevel + 1
END

SELECT * FROM #RecursiveTable ORDER BY RecurseLevel;

1 answer:

Answer 0 (score: 1)

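Analytic functions are special in that they need a known result set in order to be resolved. They depend on the following, the preceding, or the full result set to calculate the current value. Because of that, view merging is never allowed on a view that contains an analytic function. Why? Because it would change the result.

For example: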

    Select * from (
      select row_number() over (partition by c1 order by c2) rw, c3 from t) z
    where c3=123

is different from:

    select row_number() over (partition by c1 order by c2) rw, c3 from t 
    where c3=123

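These two will return different values for rw. That is why a subquery containing an analytic function is always fully resolved on its own and is never merged with the rest of the query.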
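
To make the difference concrete, here is a small sketch. The table t and its three rows are made up for illustration (they are not from the question), and the inline VALUES list is just a convenient way to avoid creating a real table.

```sql
-- Hypothetical data: t and its rows are assumptions for illustration.
WITH t (c1, c2, c3) AS (
    SELECT c1, c2, c3
    FROM (VALUES (1, 10, 123),
                 (1, 20, 456),
                 (1, 30, 123)) AS v (c1, c2, c3)
)
-- Number all three rows first, then filter:
-- the two surviving rows carry rw = 1 and rw = 3.
SELECT z.rw, z.c3
FROM (
    SELECT ROW_NUMBER() OVER (PARTITION BY c1 ORDER BY c2) AS rw, c3
    FROM t
) AS z
WHERE z.c3 = 123;

WITH t (c1, c2, c3) AS (
    SELECT c1, c2, c3
    FROM (VALUES (1, 10, 123),
                 (1, 20, 456),
                 (1, 30, 123)) AS v (c1, c2, c3)
)
-- Filter first, then number:
-- only two rows reach ROW_NUMBER, so they carry rw = 1 and rw = 2.
SELECT ROW_NUMBER() OVER (PARTITION BY c1 ORDER BY c2) AS rw, c3
FROM t
WHERE c3 = 123;
```

If the optimizer were allowed to push the c3 = 123 filter into the subquery, the first statement would silently turn into the second and its rw values would change.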

Update:

Looking at the second query:

WITH myCTE
AS
(
  SELECT *, ROW_NUMBER() OVER (ORDER BY Score desc) AS RowNumber, 1 AS RecurseLevel
  FROM tblGroups
  WHERE ParentId IS NULL
 )
  SELECT tblGroups.*, 
      ROW_NUMBER() OVER (ORDER BY myCTE.RowNumber , tblGroups.Score desc) AS RowNumber, 
      RecurseLevel + 1 AS RecurseLevel
  FROM tblGroups
      JOIN myCTE
          ON myCTE.GroupID = tblGroups.ParentID;

It works as if it were written like this (same execution plan and same results):

SELECT tblGroups.*, 
      ROW_NUMBER() OVER (ORDER BY myCTE.RowNumber , tblGroups.Score desc) AS RowNumber, 
      RecurseLevel + 1 AS RecurseLevel
FROM tblGroups
JOIN (
    SELECT *, ROW_NUMBER() OVER (ORDER BY Score desc) AS RowNumber, 1 AS RecurseLevel
    FROM tblGroups
    WHERE ParentId IS NULL
    )myCTE ON myCTE.GroupID = tblGroups.ParentID;

You would need to partition this to make the row number reset.
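
As a sketch of what that partitioning might look like (this is one reading of the remark above, not something taken from the question), partitioning by the parent's GroupID makes the plain join restart the numbering for each parent, which is the pattern the recursive CTE shows:

```sql
-- Sketch: same non-recursive query as above, with a PARTITION BY added.
-- Partitioning on myCTE.GroupID is an assumption about what "partition" means here.
WITH myCTE
AS
(
  SELECT *, ROW_NUMBER() OVER (ORDER BY Score DESC) AS RowNumber, 1 AS RecurseLevel
  FROM tblGroups
  WHERE ParentID IS NULL
)
SELECT tblGroups.*,
       -- Numbering restarts at 1 for every parent row.
       ROW_NUMBER() OVER (PARTITION BY myCTE.GroupID
                          ORDER BY tblGroups.Score DESC) AS RowNumber,
       RecurseLevel + 1 AS RecurseLevel
FROM tblGroups
    JOIN myCTE
        ON myCTE.GroupID = tblGroups.ParentID;
```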

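Recursive queries do not work like a while loop; they are not procedural. At their core they work like a recursive function, but depending on the tables, the query, and the indexes, they can be optimized to run in one way or another.

If we follow the concept that a view cannot be merged when it uses analytic functions, and look at query 1, it can only run one way, and that is as nested loops.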

WITH myCTE
AS
( /*Cannot be merged*/
  SELECT *, ROW_NUMBER() OVER (ORDER BY Score desc) AS RowNumber, 1 AS RecurseLevel,
  cast(0 as bigint) n
  FROM tblGroups
  WHERE ParentId IS NULL

  UNION ALL

/*Cannot be merged*/
  SELECT tblGroups.*, 
      ROW_NUMBER() OVER (ORDER BY myCTE.RowNumber, tblGroups.Score desc) AS RowNumber,
      RecurseLevel + 1 AS RecurseLevel,
  myCTE.RowNumber
  FROM tblGroups
      JOIN myCTE
          ON myCTE.GroupID = tblGroups.ParentID
 )
SELECT *
FROM myCTE;

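So the first select cannot be merged with the second, and the second cannot be merged either. The only way to run this query is as a nested loop over each item returned at each level, hence the reset. Again, this is not a matter of being procedural or not, but a matter of which execution plans are possible.

I hope this answers your question; if it does not, let me know.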
