Question

我遇到了this CTE solution for concatenating row elements，我认为这很棒，我意识到CTE有多强大。

然而，为了有效地使用这样的工具，我需要知道它是如何在内部工作来构建心理图像，这对初学者来说是必不可少的，就像我一样，在不同场景中使用它。

所以我尝试慢动作上面代码片段的过程，这里是代码

USE [NORTHWIND]
GO
/****** Object:  Table [dbo].[Products2]  Script Date: 10/18/2011 08:55:07 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
IF OBJECT_ID('Products2','U') IS NOT NULL  DROP TABLE [Products2]
CREATE TABLE [dbo].[Products2](
  [ProductID] [int] IDENTITY(1,1) NOT NULL,
  [ProductName] [nvarchar](40) NOT NULL,
  [SupplierID] [int] NULL,
  [CategoryID] [int] NULL,
  [QuantityPerUnit] [nvarchar](20) NULL,
  [UnitPrice] [money] NULL,
  [UnitsInStock] [smallint] NULL,
  [UnitsOnOrder] [smallint] NULL,
  [ReorderLevel] [smallint] NULL,
  [Discontinued] [bit] NOT NULL
) ON [PRIMARY]
GO
SET IDENTITY_INSERT [dbo].[Products2] ON
INSERT [dbo].[Products2] ([ProductID], [ProductName], [SupplierID], [CategoryID], [QuantityPerUnit], [UnitPrice], [UnitsInStock], [UnitsOnOrder], [ReorderLevel], [Discontinued]) VALUES (1, N'vcbcbvcbvc', 1, 4, N'10 boxes x 20 bags', 18.0000, 39, 0, 10, 0)
INSERT [dbo].[Products2] ([ProductID], [ProductName], [SupplierID], [CategoryID], [QuantityPerUnit], [UnitPrice], [UnitsInStock], [UnitsOnOrder], [ReorderLevel], [Discontinued]) VALUES (2, N'Changassad', 1, 1, N'24 - 12 oz bottles', 19.0000, 17, 40, 25, 0)
INSERT [dbo].[Products2] ([ProductID], [ProductName], [SupplierID], [CategoryID], [QuantityPerUnit], [UnitPrice], [UnitsInStock], [UnitsOnOrder], [ReorderLevel], [Discontinued]) VALUES (3, N'Aniseed Syrup', 1, 2, N'12 - 550 ml bottles', 10.0000, 13, 70, 25, 0)
INSERT [dbo].[Products2] ([ProductID], [ProductName], [SupplierID], [CategoryID], [QuantityPerUnit], [UnitPrice], [UnitsInStock], [UnitsOnOrder], [ReorderLevel], [Discontinued]) VALUES (4, N'Chef Anton''s Cajun Seasoning', 2, 2, N'48 - 6 oz jars', 22.0000, 53, 0, 0, 0)
INSERT [dbo].[Products2] ([ProductID], [ProductName], [SupplierID], [CategoryID], [QuantityPerUnit], [UnitPrice], [UnitsInStock], [UnitsOnOrder], [ReorderLevel], [Discontinued]) VALUES (5, N'Chef Anton''s Gumbo Mix', 10, 2, N'36 boxes', 21.3500, 0, 0, 0, 1)
SET IDENTITY_INSERT [dbo].[Products2] OFF
GO
IF OBJECT_ID('DELAY_EXEC','FN') IS NOT NULL  DROP FUNCTION DELAY_EXEC
GO
CREATE FUNCTION DELAY_EXEC() RETURNS DATETIME
AS
BEGIN
  DECLARE @I INT=0
  WHILE @I<99999
  BEGIN
  SELECT @I+=1
  END
  RETURN GETDATE()
END
GO

WITH CTE (EXEC_TIME, CategoryID, product_list, product_name, length)
     AS (SELECT dbo.DELAY_EXEC(),
                CategoryID,
                CAST('' AS VARCHAR(8000)),
                CAST('' AS VARCHAR(8000)),
                0
         FROM   Northwind..Products2
         GROUP  BY CategoryID
         UNION ALL
         SELECT dbo.DELAY_EXEC(),
                p.CategoryID,
                CAST(product_list + CASE
                                      WHEN length = 0 THEN ''
                                      ELSE ', '
                                    END + ProductName AS VARCHAR(8000)),
                CAST(ProductName AS VARCHAR(8000)),
                length + 1
         FROM   CTE c
                INNER JOIN Northwind..Products2 p
                  ON c.CategoryID = p.CategoryID
         WHERE  p.ProductName > c.product_name)
SELECT *
FROM   CTE
ORDER  BY EXEC_TIME  

--SELECT CategoryId, product_list
--  FROM ( SELECT CategoryId, product_list,
--  RANK() OVER ( PARTITION BY CategoryId ORDER BY length DESC )
--   FROM CTE ) D ( CategoryId, product_list, rank )
--   WHERE rank = 1 ;

评论块是连接问题的理想输出，但这不是问题。

我添加了一个EXEC_TIME列，以了解首先添加了哪一行。由于两个原因，输出看起来并不正确

我认为会有冗余数据，因为条件p.ProductName > c.product_name在另一个单词中CTE的第一部分空行总是少于Product2表中的值所以每次运行它应该再次带来一组新添加的行。这有什么意义吗？
数据的层次结构真的很奇怪，最后一项应该是最长的，看看最后一项是什么？包含length=1的项目

任何救援专家？提前谢谢。

样本结果

EXEC_TIME               CategoryID  product_list                                                        product_name                      length
----------------------- ----------- ------------------------------------------------------------------- --------------------------------- -----------
2011-10-18 12:46:14.930 1                                                                                                                 0
2011-10-18 12:46:14.990 2                                                                                                                 0
2011-10-18 12:46:15.050 4                                                                                                                 0
2011-10-18 12:46:15.107 4           vcbcbvcbvc                                                          vcbcbvcbvc                        1
2011-10-18 12:46:15.167 2           Aniseed Syrup                                                       Aniseed Syrup                     1
2011-10-18 12:46:15.223 2           Chef Anton's Cajun Seasoning                                        Chef Anton's Cajun Seasoning      1
2011-10-18 12:46:15.280 2           Chef Anton's Gumbo Mix                                              Chef Anton's Gumbo Mix            1
2011-10-18 12:46:15.340 2           Chef Anton's Cajun Seasoning, Chef Anton's Gumbo Mix                Chef Anton's Gumbo Mix            2
2011-10-18 12:46:15.400 2           Aniseed Syrup, Chef Anton's Cajun Seasoning                         Chef Anton's Cajun Seasoning      2
2011-10-18 12:46:15.463 2           Aniseed Syrup, Chef Anton's Gumbo Mix                               Chef Anton's Gumbo Mix            2
2011-10-18 12:46:15.520 2           Aniseed Syrup, Chef Anton's Cajun Seasoning, Chef Anton's Gumbo Mi  Chef Anton's Gumbo Mix            3
2011-10-18 12:46:15.580 1           Changassad                                                          Changassad                        1

Answer 1

这是一个有趣的问题，帮助我更好地理解递归CTE。

如果查看执行计划，您将看到使用了一个假脱机，并且它设置了WITH STACK属性。这意味着rows are read in a stack-like manner (Last In First Out)

首先，锚点部分运行

EXEC_TIME               CategoryID  product_list  
----------------------- ----------- --------------
2011-10-18 12:46:14.930 1                         
2011-10-18 12:46:14.990 2                         
2011-10-18 12:46:15.050 4

然后处理4，因为这是添加的最后一行。 JOIN返回添加到假脱机的1行，然后处理这个新添加的行。在这种情况下，Join不返回任何内容，因此没有任何其他内容添加到假脱机中，并继续处理CategoryID = 2行。

返回3行，添加到假脱机

Aniseed Syrup
Chef Anton's Cajun Seasoning
Chef Anton's Gumbo Mix

然后以类似的LIFO方式依次处理这些行中的每一行，并且在处理可以移动到兄弟行之前首先处理添加的任何子行。希望你能看到这个递归逻辑如何解释你观察到的结果，但万一你不能进行C#模拟

using System;
using System.Collections.Generic;
using System.Linq;

namespace Foo
{
    internal class Bar
    {
        private static void Main(string[] args)
        {
            var spool = new Stack<Tuple<int, string, string>>();

            //Add anchor elements
            AddRowToSpool(spool, new Tuple<int, string, string>(1, "", ""));
            AddRowToSpool(spool, new Tuple<int, string, string>(2, "", ""));
            AddRowToSpool(spool, new Tuple<int, string, string>(4, "", ""));

            while (spool.Count > 0)
            {
                Tuple<int, string, string> lastRowAdded = spool.Pop();
                AddChildRows(lastRowAdded, spool);
            }

            Console.ReadLine();
        }

    private static void AddRowToSpool(Stack<Tuple<int, string, string>> spool,
                                      Tuple<int, string, string> row)
        {
            Console.WriteLine("CategoryId={0}, product_list = {1}",
                              row.Item1,
                              row.Item3);
            spool.Push(row);
        }

    private static void AddChildRows(Tuple<int, string, string> lastRowAdded,
                                     Stack<Tuple<int, string, string>> spool)
        {
            int categoryId = lastRowAdded.Item1;
            string productName = lastRowAdded.Item2;
            string productList = lastRowAdded.Item3;

            string[] products;

            switch (categoryId)
            {
                case 1:
                    products = new[] {"Changassad"};
                    break;
                case 2:
                    products = new[]
                                   {
                                       "Aniseed Syrup",
                                       "Chef Anton's Cajun Seasoning",
                                       "Chef Anton's Gumbo Mix "
                                   };
                    break;
                case 4:
                    products = new[] {"vcbcbvcbvc"};
                    break;
                default:
                    products = new string[] {};
                    break;
            }


            foreach (string product in products.Where(
                product => string.Compare(productName, product) < 0))
            {
                string product_list = string.Format("{0}{1}{2}",
                                                 productList,
                                                 productList == "" ? "" : ",",
                                                 product);

                AddRowToSpool(spool,
                              new Tuple<int, string, string>
                                  (categoryId, product, product_list));
            }
        }
    }
}

返回

CategoryId=1, product_list =
CategoryId=2, product_list =
CategoryId=4, product_list =
CategoryId=4, product_list = vcbcbvcbvc
CategoryId=2, product_list = Aniseed Syrup
CategoryId=2, product_list = Chef Anton's Cajun Seasoning
CategoryId=2, product_list = Chef Anton's Gumbo Mix
CategoryId=2, product_list = Chef Anton's Cajun Seasoning,Chef Anton's Gumbo Mix
CategoryId=2, product_list = Aniseed Syrup,Chef Anton's Cajun Seasoning
CategoryId=2, product_list = Aniseed Syrup,Chef Anton's Gumbo Mix
CategoryId=2, product_list = Aniseed Syrup,Chef Anton's Cajun Seasoning,Chef Anton's Gumbo Mix
CategoryId=1, product_list = Changassad

Answer 2

页面Recursive Queries Using Common Table Expressions描述了CTE的逻辑：

递归执行的语义如下：



将CTE表达式拆分为锚点和递归成员。



运行锚定成员创建第一个调用或基本结果集（T0）。



以Ti作为输入并以Ti + 1作为输出运行递归成员。



重复步骤3，直到返回空集。



返回结果集。这是T0到Tn的UNION ALL。

但是，这只是逻辑流程。与往常一样，使用SQL，如果结果“相同”，服务器可以自由地重新排序操作，如果结果“相同”，则可以认为重新排序可以更有效地提供结果。

在决定是否重新排序操作时，通常会考虑具有副作用的函数（导致延迟，然后返回GETDATE()）。

可以重新排序查询的一种显而易见的方法是，它可能决定在完全创建结果集Ti+1之前开始处理结果集Ti - 执行此操作可能更有效而不是首先完全构造Ti，因为新行肯定已经在内存中并且最近已被访问过。

CTE如何真正起作用？

样本结果

2 个答案: