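I have a table with a self-referencing parent/child relationship and a date-created column. I want to display each parent record together with all of its descendants, ordered by the most recent "activity" anywhere on the node. So if row 1, which was created long ago, gets a new child (or a new child is added to one of its children), I want it to appear at the top of the results.
I can't get this to work at the moment.
My table structure is as follows: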
CREATE TABLE [dbo].[Orders](
[OrderId] [int] NOT NULL,
[Orders_OrderId] [int] NULL,
[DateOrdered] datetime)
I wrote the following SQL to pull out the information:
WITH allOrders AS
(SELECT po.orderid, po.Orders_OrderId, po.DateOrdered, 0 as distance,
row_number() over (order by DateOrdered desc) as RN1
FROM orders po WHERE po.Orders_OrderId is null
UNION ALL
SELECT b2.orderid ,b2.Orders_OrderId, b2.DateOrdered, c.distance + 1,
c.RN1
FROM orders b2
INNER JOIN allOrders c
ON b2.Orders_OrderId = c.orderid
)
SELECT * from allOrders
where RN1 between 0 and 2
order by rn1 asc, distance asc
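Is there any way to "aggregate" the results of the recursive select so that I can pick the max date across the whole "parent" node?
SQLFiddle demo: http://sqlfiddle.com/#!3/ca6cb/11 (record 1 should come first, because it has a child that was updated most recently)
Update: Thanks to @twrowsell's suggestion, I now have the following query working, but it feels clunky and has some performance issues. I don't think I should need 3 CTEs to achieve this. Is there any way to condense it while keeping the "row number" (it is used for paging in a user-facing display)?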
WITH allOrders AS
(SELECT po.orderid, po.Orders_OrderId, 0 as distance, po.DateOrdered, po.orderid as [rootId]
FROM orders po WHERE po.Orders_OrderId is null
UNION ALL
SELECT b2.orderid ,b2.Orders_OrderId, c.distance + 1, b2.DateOrdered, c.[rootId]
FROM orders b2
INNER JOIN allOrders c
ON b2.Orders_OrderId = c.orderid
),
mostRecentOrders as (
SELECT *,
MAX(DateOrdered) OVER (PARTITION BY rootId) as [HighestOrderId]
from allOrders
),
pagedOrders as (
select *, dense_rank() over (order by [HighestOrderId] desc) as [PagedRowNumber] from mostRecentOrders)
SELECT * from pagedOrders
where PagedRowNumber between 0 and 2
order by [HighestOrderId] desc
Also, I could use MAX(orderid), since orderid is an identity column and the created date cannot be updated after creation in my scenario.
Updated SQLFiddle: http://sqlfiddle.com/#!3/ca6cb/41
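Answer 0 (score: 2)
First, you need to store the "root" order ID so that you can tell the different order "trees" apart. Once you have that, you can aggregate and sort the data.
As far as I can tell, because you can't use DENSE_RANK() in a WHERE clause, you need at least one CTE to build the tree and a second one to do the ranking.
The query below uses a temp table to store the tree. It selects from the tree twice: once for the rows and once for the ranking. If I used a CTE to hold the tree, it would have to be built twice, because a CTE is basically just a reusable subquery (it is re-evaluated each time it is referenced). Using a temp table ensures the tree is built only once.
Here is the SQL: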
DECLARE @Offset INT = 0;
DECLARE @Fetch INT = 2;
-- Create the Order Trees
WITH OrderTree AS (
SELECT po.orderid AS RootOrderID,
po.orderid,
po.Orders_OrderId,
po.DateOrdered,
0 AS distance
FROM orders po WHERE po.Orders_OrderId IS NULL
UNION/**/ALL
SELECT parent.RootOrderID,
child.orderid,
child.Orders_OrderId,
child.DateOrdered,
parent.distance + 1 AS distance
FROM orders child
INNER JOIN OrderTree parent
ON child.Orders_OrderId = parent.orderid
)
SELECT *
INTO #OrderTree
FROM OrderTree;
-- Rank the order trees by MAX(DateOrdered)
WITH
Rankings AS (
SELECT RootOrderID,
MAX(DateOrdered) AS MaxDate,
ROW_NUMBER() OVER(ORDER BY MAX(DateOrdered) DESC, RootOrderID ASC) AS Rank
FROM #OrderTree
GROUP BY RootOrderID
)
-- Get the next @Fetch trees, starting at rank @Offset+1
SELECT TREE.*,
R.MaxDate,
R.Rank
FROM Rankings R
INNER JOIN #OrderTree TREE
ON R.RootOrderID = TREE.RootOrderID
WHERE R.Rank BETWEEN @Offset+1 AND (@Fetch+@Offset)
ORDER BY R.Rank ASC, TREE.distance ASC;
Note: the /**/ between UNION and ALL is a workaround for this issue.
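I built my own "Orders" table from data in an existing table in my database and ran some benchmarks against the 3-CTE query from the question. The temp table version came out slightly ahead on a large data set (117 trees, 37,215 orders in total, maximum depth 11). I benchmarked by running each query with STATISTICS IO and STATISTICS TIME turned on, clearing the cache and buffers before each run.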
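For anyone who wants to repeat the measurement, a benchmarking harness along these lines would do. This is a minimal sketch: the answer does not say exactly which commands were used to clear the cache and buffers, so the DBCC calls below are an assumption (they are the usual way to get a cold cache in SQL Server). Run it only on a test server, since it flushes caches for the whole instance.

```sql
-- Assumed setup, not taken from the answer: cold-cache benchmark harness.
-- Do NOT run on a production server; the DBCC commands affect every database.
CHECKPOINT;               -- flush dirty pages so the buffer pool can be fully emptied
DBCC DROPCLEANBUFFERS;    -- empty the buffer pool (cold data cache)
DBCC FREEPROCCACHE;       -- discard cached execution plans

SET STATISTICS IO ON;     -- report scan counts and logical/physical reads per table
SET STATISTICS TIME ON;   -- report CPU time and elapsed time

-- ... run the query under test here ...

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

The numbers appear on the Messages tab in SSMS rather than in the result grid.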
Here are the results for the two queries, along with the results for the recursive CTE that both of them share:
╔════════════╦══════════╦════════════╦══════════════╗
║ Query ║ CPU Time ║ Scan Count ║ Logical Reads║
╠════════════╬══════════╬════════════╬══════════════╣
║ Tree CTE ║ 24211ms ║ 4 ║ 1116243 ║
╟────────────╫──────────╫────────────╫──────────────╢
║ 3-CTE ║ 24789ms ║ 7 ║ 1192221 ║
║ Temp Table ║ 24384ms ║ 6 ║ 1116549 ║
╚════════════╩══════════╩════════════╩══════════════╝
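The bulk of the cost of both queries is the recursive order-tree CTE. Subtracting the shared cost of the recursive CTE gives the following results: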
╔════════════╦══════════╦════════════╦══════════════╗
║ Query ║ CPU Time ║ Scan Count ║ Logical Reads║
╠════════════╬══════════╬════════════╬══════════════╣
║ 3-CTE ║ 578ms ║ 3 ║ 75978 ║
║ Temp Table ║ 173ms ║ 2 ║ 306 ║
╚════════════╩══════════╩════════════╩══════════════╝
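Based on these results, I strongly recommend adding a RootOrderID column to your Orders table so that you can avoid the recursive CTE, which can be very expensive.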
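A rough sketch of what that could look like follows. The column name RootOrderId, the one-time backfill, and the TreeRank alias are assumptions for illustration; they are not part of the original schema or of the benchmark above. GO is the SSMS/sqlcmd batch separator, needed here so that the new column exists before later batches reference it.

```sql
-- Assumption: RootOrderId is a new, hypothetical column added for this sketch.
ALTER TABLE dbo.Orders ADD RootOrderId int NULL;
GO

-- One-time backfill: walk each tree once and stamp every row with its root's ID.
WITH OrderTree AS (
    SELECT po.OrderId AS RootOrderId, po.OrderId
    FROM dbo.Orders po
    WHERE po.Orders_OrderId IS NULL
    UNION ALL
    SELECT parent.RootOrderId, child.OrderId
    FROM dbo.Orders child
    INNER JOIN OrderTree parent
        ON child.Orders_OrderId = parent.OrderId
)
UPDATE o
SET RootOrderId = t.RootOrderId
FROM dbo.Orders o
INNER JOIN OrderTree t
    ON t.OrderId = o.OrderId;
GO

-- Paging query with no recursion: rank the trees, then fetch their rows.
DECLARE @Offset int = 0;
DECLARE @Fetch int = 2;

WITH Rankings AS (
    SELECT RootOrderId,
           MAX(DateOrdered) AS MaxDate,
           ROW_NUMBER() OVER (ORDER BY MAX(DateOrdered) DESC, RootOrderId ASC) AS TreeRank
    FROM dbo.Orders
    GROUP BY RootOrderId
)
SELECT o.*, r.MaxDate, r.TreeRank
FROM Rankings r
INNER JOIN dbo.Orders o
    ON o.RootOrderId = r.RootOrderId
WHERE r.TreeRank BETWEEN @Offset + 1 AND @Offset + @Fetch
ORDER BY r.TreeRank ASC, o.OrderId ASC;
```

With this design, every insert of a child order has to copy RootOrderId from its parent (a root order uses its own OrderId). The distance column from the recursive version is no longer available unless it is stored as well, which is why the sketch orders rows within a tree by OrderId.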
Answer 1 (score: 1)
Using MAX on DateOrdered in an OVER clause in the outer select should work.
WITH allOrders AS
(
SELECT po.orderid, po.Orders_OrderId, po.DateOrdered, 0 as distance,
row_number() over (order by DateOrdered desc) as RN1
FROM orders po WHERE po.Orders_OrderId is null
UNION ALL
SELECT b2.orderid ,b2.Orders_OrderId, b2.DateOrdered, c.distance + 1,
c.RN1
FROM orders b2
INNER JOIN allOrders c
ON b2.Orders_OrderId = c.orderid
)
SELECT *, MAX(DateOrdered) OVER (PARTITION BY Orders_OrderId) from allOrders
where RN1 between 0 and 2
order by rn1 asc, distance asc
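Edit: Sorry, I misread your requirement the first time. It looks like you want to partition the results by the RN1 field rather than by Orders_OrderId, so your outer select would look something like this: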
SELECT MAX(DateOrdered) OVER (PARTITION BY RN1 ),* from allOrders
where RN1 between 0 and 2
order by rn1 asc, distance asc
Answer 2 (score: 1)
Take a look at the following:
;WITH allOrders AS
(SELECT po.orderid, po.Orders_OrderId, po.DateOrdered, 0 as distance, po.orderid as [parentOrder]
FROM orders po WHERE po.Orders_OrderId is null
UNION ALL
SELECT b2.orderid ,b2.Orders_OrderId, b2.DateOrdered, c.distance + 1, c.[parentOrder]
FROM orders b2
INNER JOIN allOrders c ON b2.Orders_OrderId = c.orderid
)
SELECT a.OrderId
,a.Orders_OrderId
,a.DateOrdered
,top1.DateOrdered as HIghestDate
,a.distance
,a.parentOrder
FROM allOrders a
INNER JOIN (SELECT TOP 2 parentOrder, MAX(DateOrdered)as highestdates FROM allOrders GROUP BY parentOrder ORDER BY MAX(DateOrdered)DESC)b on a.parentOrder=b.parentOrder
OUTER APPLY (SELECT TOP 1 parentOrder, DateOrdered FROM allOrders top1 WHERE a.parentOrder=top1.parentOrder ORDER BY top1.DateOrdered DESC)top1
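Answer 3 (score: 1)
I'm having trouble understanding exactly what you need, including the paging. If you provided the expected result set for your sample data, it would be much easier to check.
Anyway, it looks like your main difficulty is this:
Is there any way to "aggregate" the results of the recursive select so that I can pick the max date across the whole "parent" node?
...and that is easily done with a recursive CTE and APPLY.
I'm not sure exactly what you want, so I made these two fiddles:
SQL Fiddle 1 - here all children are grouped under their root order, i.e., order 3 goes with its parent's (order 2) parent (order 1).
SQL Fiddle 2 - here children are grouped with their immediate parent, and parents are treated as roots too, so order 2 does not come to the top together with its parent (order 1).
I think you will want the first one, with some modifications.
Again, in questions like this it is very important to show the result you expect; otherwise you will get a lot of trial-and-error answers.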
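Since the fiddles are only linked, here is a rough sketch of the recursive CTE plus APPLY idea for the first variant (everything grouped under the root order). It is not the contents of either fiddle: the rootId column follows the question's updated query, and the MaxDate alias and the final ordering are assumptions.

```sql
WITH allOrders AS (
    -- anchor: root orders carry their own ID as rootId
    SELECT po.orderid, po.Orders_OrderId, po.DateOrdered,
           0 AS distance, po.orderid AS rootId
    FROM orders po
    WHERE po.Orders_OrderId IS NULL
    UNION ALL
    -- recursive step: children inherit rootId from their parent
    SELECT b2.orderid, b2.Orders_OrderId, b2.DateOrdered,
           c.distance + 1, c.rootId
    FROM orders b2
    INNER JOIN allOrders c
        ON b2.Orders_OrderId = c.orderid
)
SELECT a.*, m.MaxDate
FROM allOrders a
-- for each row, look up the latest DateOrdered anywhere in the same tree
CROSS APPLY (
    SELECT MAX(t.DateOrdered) AS MaxDate
    FROM allOrders t
    WHERE t.rootId = a.rootId
) m
ORDER BY m.MaxDate DESC, a.rootId, a.distance;
```

Note that the CTE is referenced twice here, so the tree may be evaluated more than once. On large tables it is worth comparing this against a version that materializes the tree first.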
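Answer 4 (score: 0)
I was able to get the same result set as the one described in your updated fiddle. I started my solution from part of pedro's cross apply, but in my own experience APPLY performs very poorly. It eventually evolved into its current form: a left join from the main table to a subquery that does the paging you asked for.
The fiddle is here (SQLFiddle).
The code is included below as well: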
WITH allOrders AS (
--anchor
SELECT po.orderid
, po.Orders_OrderId
, 0 AS distance
, po.DateOrdered
, po.orderid AS [rootId]
FROM orders po
WHERE po.Orders_OrderId IS NULL
--recursive
UNION ALL
SELECT b2.orderid
, b2.Orders_OrderId
, c.distance + 1
, b2.DateOrdered
, c.[rootId]
FROM orders b2
JOIN allOrders c
ON b2.Orders_OrderId = c.orderid
)
SELECT a.*
, b.max_orderdate
, RN1
FROM allOrders a
LEFT JOIN (SELECT DISTINCT rootid, max(DateOrdered) max_orderdate
, row_number() over (order by max(dateordered) desc) as RN1
FROM allOrders GROUP BY rootid) b
ON a.rootid = b.rootid
where RN1 between 0 and 2
ORDER BY b.max_orderdate DESC, a.rootid, a.orders_orderid, a.orderid