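I came across this CTE solution for concatenating row elements and thought it was brilliant; it made me realize how powerful CTEs can be.
However, to use such a tool effectively I need to know how it works internally, so that I can build a mental picture of it. That is essential for beginners like me who want to apply it in different scenarios.
So I tried to slow down the execution of the snippet above. Here is the code: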
USE [NORTHWIND]
GO
/****** Object: Table [dbo].[Products2] Script Date: 10/18/2011 08:55:07 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
IF OBJECT_ID('Products2','U') IS NOT NULL DROP TABLE [Products2]
CREATE TABLE [dbo].[Products2](
    [ProductID] [int] IDENTITY(1,1) NOT NULL,
    [ProductName] [nvarchar](40) NOT NULL,
    [SupplierID] [int] NULL,
    [CategoryID] [int] NULL,
    [QuantityPerUnit] [nvarchar](20) NULL,
    [UnitPrice] [money] NULL,
    [UnitsInStock] [smallint] NULL,
    [UnitsOnOrder] [smallint] NULL,
    [ReorderLevel] [smallint] NULL,
    [Discontinued] [bit] NOT NULL
) ON [PRIMARY]
GO
SET IDENTITY_INSERT [dbo].[Products2] ON
INSERT [dbo].[Products2] ([ProductID], [ProductName], [SupplierID], [CategoryID], [QuantityPerUnit], [UnitPrice], [UnitsInStock], [UnitsOnOrder], [ReorderLevel], [Discontinued]) VALUES (1, N'vcbcbvcbvc', 1, 4, N'10 boxes x 20 bags', 18.0000, 39, 0, 10, 0)
INSERT [dbo].[Products2] ([ProductID], [ProductName], [SupplierID], [CategoryID], [QuantityPerUnit], [UnitPrice], [UnitsInStock], [UnitsOnOrder], [ReorderLevel], [Discontinued]) VALUES (2, N'Changassad', 1, 1, N'24 - 12 oz bottles', 19.0000, 17, 40, 25, 0)
INSERT [dbo].[Products2] ([ProductID], [ProductName], [SupplierID], [CategoryID], [QuantityPerUnit], [UnitPrice], [UnitsInStock], [UnitsOnOrder], [ReorderLevel], [Discontinued]) VALUES (3, N'Aniseed Syrup', 1, 2, N'12 - 550 ml bottles', 10.0000, 13, 70, 25, 0)
INSERT [dbo].[Products2] ([ProductID], [ProductName], [SupplierID], [CategoryID], [QuantityPerUnit], [UnitPrice], [UnitsInStock], [UnitsOnOrder], [ReorderLevel], [Discontinued]) VALUES (4, N'Chef Anton''s Cajun Seasoning', 2, 2, N'48 - 6 oz jars', 22.0000, 53, 0, 0, 0)
INSERT [dbo].[Products2] ([ProductID], [ProductName], [SupplierID], [CategoryID], [QuantityPerUnit], [UnitPrice], [UnitsInStock], [UnitsOnOrder], [ReorderLevel], [Discontinued]) VALUES (5, N'Chef Anton''s Gumbo Mix', 10, 2, N'36 boxes', 21.3500, 0, 0, 0, 1)
SET IDENTITY_INSERT [dbo].[Products2] OFF
GO
IF OBJECT_ID('DELAY_EXEC','FN') IS NOT NULL DROP FUNCTION DELAY_EXEC
GO
CREATE FUNCTION DELAY_EXEC() RETURNS DATETIME
AS
BEGIN
    -- Busy-wait so that consecutive calls return visibly different timestamps
    DECLARE @I INT=0
    WHILE @I<99999
    BEGIN
        SELECT @I+=1
    END
    RETURN GETDATE()
END
GO
WITH CTE (EXEC_TIME, CategoryID, product_list, product_name, length)
     AS (-- Anchor member: one row per category with an empty list
         SELECT dbo.DELAY_EXEC(),
                CategoryID,
                CAST('' AS VARCHAR(8000)),
                CAST('' AS VARCHAR(8000)),
                0
         FROM   Northwind..Products2
         GROUP  BY CategoryID
         UNION ALL
         -- Recursive member: append every product name greater than the last one added
         SELECT dbo.DELAY_EXEC(),
                p.CategoryID,
                CAST(product_list + CASE
                                      WHEN length = 0 THEN ''
                                      ELSE ', '
                                    END + ProductName AS VARCHAR(8000)),
                CAST(ProductName AS VARCHAR(8000)),
                length + 1
         FROM   CTE c
                INNER JOIN Northwind..Products2 p
                  ON c.CategoryID = p.CategoryID
         WHERE  p.ProductName > c.product_name)
SELECT *
FROM   CTE
ORDER  BY EXEC_TIME
--SELECT CategoryId, product_list
--  FROM ( SELECT CategoryId, product_list,
--                RANK() OVER ( PARTITION BY CategoryId ORDER BY length DESC )
--           FROM CTE ) D ( CategoryId, product_list, rank )
-- WHERE rank = 1 ;
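The commented-out block produces the desired output for the concatenation problem, but that is not what I am asking about.
I added an EXEC_TIME column to find out which row is added first. The output does not look right to me, for two reasons.
First, I expected redundant data because of the condition p.ProductName > c.product_name. In other words, the empty product_name from the anchor part of the CTE is always less than the values in the Products2 table, so every iteration should bring back the same set of rows that was already added. Does that make sense?
Second, the hierarchy of the data looks really strange. The last item should be the longest one, but look at what the last item actually is: a row with length = 1.
Can any expert come to the rescue? Thanks in advance.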
EXEC_TIME CategoryID product_list product_name length
----------------------- ----------- ------------------------------------------------------------------- --------------------------------- -----------
2011-10-18 12:46:14.930 1 0
2011-10-18 12:46:14.990 2 0
2011-10-18 12:46:15.050 4 0
2011-10-18 12:46:15.107 4 vcbcbvcbvc vcbcbvcbvc 1
2011-10-18 12:46:15.167 2 Aniseed Syrup Aniseed Syrup 1
2011-10-18 12:46:15.223 2 Chef Anton's Cajun Seasoning Chef Anton's Cajun Seasoning 1
2011-10-18 12:46:15.280 2 Chef Anton's Gumbo Mix Chef Anton's Gumbo Mix 1
2011-10-18 12:46:15.340 2 Chef Anton's Cajun Seasoning, Chef Anton's Gumbo Mix Chef Anton's Gumbo Mix 2
2011-10-18 12:46:15.400 2 Aniseed Syrup, Chef Anton's Cajun Seasoning Chef Anton's Cajun Seasoning 2
2011-10-18 12:46:15.463 2 Aniseed Syrup, Chef Anton's Gumbo Mix Chef Anton's Gumbo Mix 2
2011-10-18 12:46:15.520 2 Aniseed Syrup, Chef Anton's Cajun Seasoning, Chef Anton's Gumbo Mi Chef Anton's Gumbo Mix 3
2011-10-18 12:46:15.580 1 Changassad Changassad 1
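Answer 0 (score: 5)
This is an interesting question that helped me understand recursive CTEs better.
If you look at the execution plan, you will see that a spool is used and that it has the WITH STACK property set. This means that rows are read in a stack-like manner (Last In First Out).
First, the anchor part runs: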
EXEC_TIME CategoryID product_list
----------------------- ----------- --------------
2011-10-18 12:46:14.930 1
2011-10-18 12:46:14.990 2
2011-10-18 12:46:15.050 4
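Then the CategoryID = 4 row is processed, because it was the last row added. The JOIN returns 1 row, which is added to the spool, and this newly added row is processed next. This time the JOIN returns nothing, so nothing else is added to the spool and processing moves on to the CategoryID = 2 row.
That returns 3 rows, which are added to the spool: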
Aniseed Syrup
Chef Anton's Cajun Seasoning
Chef Anton's Gumbo Mix
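Each of these rows is then processed in turn in the same LIFO manner, and any child rows that are added get processed before processing can move on to a sibling row. Hopefully you can see how this recursive logic explains the results you observed, but in case you can't, here is a C# simulation: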
using System;
using System.Collections.Generic;
using System.Linq;

namespace Foo
{
    internal class Bar
    {
        private static void Main(string[] args)
        {
            var spool = new Stack<Tuple<int, string, string>>();

            //Add anchor elements
            AddRowToSpool(spool, new Tuple<int, string, string>(1, "", ""));
            AddRowToSpool(spool, new Tuple<int, string, string>(2, "", ""));
            AddRowToSpool(spool, new Tuple<int, string, string>(4, "", ""));

            //Process the most recently added row first (LIFO)
            while (spool.Count > 0)
            {
                Tuple<int, string, string> lastRowAdded = spool.Pop();
                AddChildRows(lastRowAdded, spool);
            }
            Console.ReadLine();
        }

        private static void AddRowToSpool(Stack<Tuple<int, string, string>> spool,
                                          Tuple<int, string, string> row)
        {
            Console.WriteLine("CategoryId={0}, product_list = {1}",
                              row.Item1,
                              row.Item3);
            spool.Push(row);
        }

        private static void AddChildRows(Tuple<int, string, string> lastRowAdded,
                                         Stack<Tuple<int, string, string>> spool)
        {
            int categoryId = lastRowAdded.Item1;
            string productName = lastRowAdded.Item2;
            string productList = lastRowAdded.Item3;
            string[] products;

            switch (categoryId)
            {
                case 1:
                    products = new[] {"Changassad"};
                    break;
                case 2:
                    products = new[]
                                   {
                                       "Aniseed Syrup",
                                       "Chef Anton's Cajun Seasoning",
                                       "Chef Anton's Gumbo Mix"
                                   };
                    break;
                case 4:
                    products = new[] {"vcbcbvcbvc"};
                    break;
                default:
                    products = new string[] {};
                    break;
            }

            //Mirrors WHERE p.ProductName > c.product_name
            foreach (string product in products.Where(
                product => string.Compare(productName, product) < 0))
            {
                string product_list = string.Format("{0}{1}{2}",
                                                    productList,
                                                    productList == "" ? "" : ",",
                                                    product);
                AddRowToSpool(spool,
                              new Tuple<int, string, string>
                                  (categoryId, product, product_list));
            }
        }
    }
}
This returns:
CategoryId=1, product_list =
CategoryId=2, product_list =
CategoryId=4, product_list =
CategoryId=4, product_list = vcbcbvcbvc
CategoryId=2, product_list = Aniseed Syrup
CategoryId=2, product_list = Chef Anton's Cajun Seasoning
CategoryId=2, product_list = Chef Anton's Gumbo Mix
CategoryId=2, product_list = Chef Anton's Cajun Seasoning,Chef Anton's Gumbo Mix
CategoryId=2, product_list = Aniseed Syrup,Chef Anton's Cajun Seasoning
CategoryId=2, product_list = Aniseed Syrup,Chef Anton's Gumbo Mix
CategoryId=2, product_list = Aniseed Syrup,Chef Anton's Cajun Seasoning,Chef Anton's Gumbo Mix
CategoryId=1, product_list = Changassad
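Answer 1 (score: 3)
The page Recursive Queries Using Common Table Expressions describes the logic of a CTE:
The semantics of the recursive execution is as follows:
1. Split the CTE expression into anchor and recursive members.
2. Run the anchor member(s), creating the first invocation or base result set (T0).
3. Run the recursive member(s) with Ti as an input and Ti+1 as an output.
4. Repeat step 3 until an empty set is returned.
5. Return the result set. This is a UNION ALL of T0 to Tn.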
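The five logical steps above can be sketched in a few lines. This is only an illustration written in Python rather than T-SQL: the names PRODUCTS, anchor, recursive_member and run_cte are made up for this sketch, the product names are copied from the sample data in the question, and Python's plain string comparison stands in for whatever collation the server actually uses.

```python
# Product names per CategoryID, copied from the question's sample data.
PRODUCTS = {
    1: ["Changassad"],
    2: ["Aniseed Syrup", "Chef Anton's Cajun Seasoning", "Chef Anton's Gumbo Mix"],
    4: ["vcbcbvcbvc"],
}


def anchor():
    """Step 2: T0 holds one row per category with an empty list."""
    return [(cat, "", "", 0) for cat in sorted(PRODUCTS)]


def recursive_member(t_i):
    """Step 3: build T(i+1) from T(i) by joining on category and
    keeping only names greater than the row's product_name."""
    t_next = []
    for cat, product_list, product_name, length in t_i:
        for name in PRODUCTS[cat]:
            if name > product_name:
                sep = "" if length == 0 else ", "
                t_next.append((cat, product_list + sep + name, name, length + 1))
    return t_next


def run_cte():
    """Steps 3 and 4: keep producing levels until one comes back empty."""
    levels = [anchor()]
    while True:
        nxt = recursive_member(levels[-1])
        if not nxt:
            return levels
        levels.append(nxt)


levels = run_cte()
# Step 5: the final result is the UNION ALL of T0..Tn.
result = [row for level in levels for row in level]
for i, level in enumerate(levels):
    print(f"T{i}: {len(level)} rows")
```

Under these assumptions the sketch produces 12 rows in total, the same number as in the question's output, but grouped strictly by level (all length 0 rows, then all length 1 rows, and so on).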
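However, this is only the logical flow. As always with SQL, the server is free to reorder operations as long as the result is "the same" and the reordering is judged to deliver the result more efficiently.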
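The presence of a function with side effects (such as one that causes a delay and then returns GETDATE()) is not something that is generally taken into account when deciding whether to reorder operations.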
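One obvious way the query could be reordered is that the server may decide to start working on result set Ti+1 before result set Ti has been fully created. Doing so may be more efficient than constructing Ti in its entirety first, because the new rows are certainly already in memory and have been accessed recently.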
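To make the "same result, different order" point concrete, here is a small Python sketch (Python is used purely for illustration; evaluate and children are hypothetical helper names, and the data is copied from the question). It expands pending rows either from a stack, so new rows are processed first, or from a queue, which matches the level-by-level logical description, and then compares what was emitted.

```python
from collections import deque

# Product names per CategoryID, copied from the question's sample data.
PRODUCTS = {
    1: ["Changassad"],
    2: ["Aniseed Syrup", "Chef Anton's Cajun Seasoning", "Chef Anton's Gumbo Mix"],
    4: ["vcbcbvcbvc"],
}


def children(row):
    """Rows the recursive member would produce from a single input row."""
    cat, product_list, product_name, length = row
    sep = "" if length == 0 else ", "
    return [(cat, product_list + sep + name, name, length + 1)
            for name in PRODUCTS[cat] if name > product_name]


def evaluate(lifo):
    """Return rows in the order they are produced.

    lifo=True  -> most recently added row is expanded first (stack)
    lifo=False -> oldest row is expanded first (queue, level by level)
    """
    pending = deque((cat, "", "", 0) for cat in sorted(PRODUCTS))
    emitted = list(pending)
    while pending:
        row = pending.pop() if lifo else pending.popleft()
        for child in children(row):
            emitted.append(child)
            pending.append(child)
    return emitted


stack_order = evaluate(lifo=True)
queue_order = evaluate(lifo=False)
print([row[3] for row in stack_order])   # length column, stack processing
print([row[3] for row in queue_order])   # length column, level-by-level processing
print(sorted(stack_order) == sorted(queue_order))
```

With the stack, the sequence of length values follows the same pattern as the EXEC_TIME ordering shown in the question, ending on a length 1 row; with the queue, the lengths never decrease. Both runs emit exactly the same set of rows, which is all the logical definition promises.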