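I'm trying to get a "lineage" or something like it, along with information about the first and last links (at least; everything would be nice), from a table whose rows have self-referencing links between rows that have been "replaced" and the rows that replaced them. The table's structure is:

CREATE TABLE Thing (
    Id INT PRIMARY KEY,
    TStamp DATETIME,
    Replaces INT NULL,
    ReplacedBy INT NULL
);

I'm stuck with this structure. :-) It's doubly-linked (yes, that's a bit silly): every row has a unique `Id`, a row that has been "replaced" by another row has a non-`NULL` `ReplacedBy` giving the `Id` of the replacing row, and the replacing row also has a link back to what it replaced in `Replaces`. So we can use `Replaces` or `ReplacedBy` (or both) if we like.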
Here's some sample data:

INSERT INTO Thing
    (Id, TStamp, Replaces, ReplacedBy)
VALUES
    (1, '2017-01-01', NULL, 11),
    (2, '2017-01-02', NULL, 12),
    (3, '2017-01-03', NULL, NULL),
    (4, '2017-01-04', NULL, NULL),
    (11, '2017-01-11', 1, NULL),
    (12, '2017-01-12', 2, 22),
    (22, '2017-01-22', 12, NULL);

So 1 was replaced by 11, 2 was replaced by 12, and 12 was replaced by 22.

I'd like to get, in a reasonable way, the following information for each chain of links in this table:

...filtered by a date range applied to the *last* row in the chain.

In an ideal universe, I'd get something like this:

+---------+--------+----+-------+------------+
| FirstId | LastId | Id | Links | TStamp     |
+---------+--------+----+-------+------------+
| 1       | 11     | 1  | 2     | 2017-01-01 |
| 1       | 11     | 11 | 2     | 2017-01-11 |
| 2       | 22     | 2  | 3     | 2017-01-02 |
| 2       | 22     | 12 | 3     | 2017-01-12 |
| 2       | 22     | 22 | 3     | 2017-01-22 |
+---------+--------+----+-------+------------+

So far I have this query, whose results I can post-process to get the above:

WITH Data AS (
    SELECT Id, TStamp, Replaces, ReplacedBy, 0 AS Depth
    FROM Thing
    UNION ALL
    SELECT Thing.Id, Thing.TStamp, Thing.Replaces, Thing.ReplacedBy, Depth + 1
    FROM Data
    JOIN Thing
    ON Thing.Replaces = Data.Id
)
SELECT *
FROM Data
WHERE ReplacedBy IS NOT NULL OR Depth > 0
ORDER BY
    Id, Depth;

That gives me:

+----+------------+----------+------------+-------+
| Id | TStamp     | Replaces | ReplacedBy | Depth |
+----+------------+----------+------------+-------+
| 1  | 2017-01-01 | NULL     | 11         | 0     |
| 2  | 2017-01-02 | NULL     | 12         | 0     |
| 11 | 2017-01-11 | 1        | NULL       | 1     |
| 12 | 2017-01-12 | 2        | 22         | 0     |
| 12 | 2017-01-12 | 2        | 22         | 1     |
| 22 | 2017-01-22 | 12       | NULL       | 1     |
| 22 | 2017-01-22 | 12       | NULL       | 2     |
+----+------------+----------+------------+-------+

I can work out (for example) the last row of each chain using something like this:

WITH Data AS (
    SELECT Id, Replaces, ReplacedBy, 0 AS Depth
    FROM Thing
    UNION ALL
    SELECT Thing.Id, Thing.Replaces, Thing.ReplacedBy, Depth + 1
    FROM Data
    JOIN Thing
    ON Thing.Replaces = Data.Id
),
MaxData AS (
    SELECT Data.Id, Data.Depth
    FROM Data
    JOIN (
        SELECT Id, MAX(Depth) AS MaxDepth
        FROM Data
        GROUP BY Id
    ) j
    ON Data.Id = j.Id AND Data.Depth = j.MaxDepth
    WHERE Data.Depth > 0
)
SELECT *
FROM MaxData
ORDER BY
    Id;

...which gives me:

+----+-------+
| Id | Depth |
+----+-------+
| 11 | 1     |
| 12 | 1     |
| 22 | 2     |
+----+-------+

...but then I've lost the starting point and the points in between.

I have a strong feeling I'm missing something fairly straightforward, but clever, that would get me most of the way there in the query itself rather than in post-processing: some kind of join of "min" and "max" queries (but not like the one I have above). What would it be?

The table has no indexes on `ReplacedBy` or `Replaces`, but we can add whatever indexes are needed. The table is only lightly used (about 300k rows, with perhaps a few hundred updates/inserts per day).

I'm limited to SQL Server 2008 features.
Answer 0 (score: 3)
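Inspired by Gordon Linoff's answer and by HABO's comment, which highlighted something Gordon was doing that turned out to be important, I:

- Removed the `FIRST_VALUE` function, replacing it with a `CROSS APPLY` against a data "overview" query
- Added a `Links` count
- Removed the reliance on `t` in `WHERE NOT EXISTS (SELECT 1 FROM Thing t2 WHERE t2.ReplacedBy = t.id)`, where (at least on SQL Server 2008) it wasn't bound to anything

Below, I've also added the date filtering mentioned in the question:

...filtered by a date range applied to the last row in the chain.

...which Gordon didn't cover at all, and which changes our approach, but only in terms of the arrow of time.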
So first, without the date criterion, sticking quite close to Gordon's answer:
WITH Data AS (
SELECT Id AS FirstId, Id, TStamp, Replaces, ReplacedBy, 0 AS Depth
FROM Thing
WHERE Replaces IS NULL AND ReplacedBy IS NOT NULL
UNION ALL
SELECT d.FirstId, t.Id, t.TStamp, t.Replaces, t.ReplacedBy, d.Depth + 1
FROM Data d
JOIN Thing t ON t.Replaces = d.Id
),
Overview AS (
SELECT FirstId, MAX(Id) AS LastId, COUNT(*) AS Links
FROM Data
GROUP BY
FirstId
)
SELECT d.FirstId, o.LastId, d.Id, o.Links, d.Depth, d.TStamp
FROM Data d
CROSS APPLY (
SELECT LastId, Links
FROM Overview
WHERE FirstId = d.FirstId
) o
ORDER BY
d.FirstId, d.Depth
;
The key part is grabbing the seed `Id` as `FirstId`:
SELECT Id AS FirstId, Id, TStamp, Replaces, ReplacedBy, 0 AS Depth
FROM Thing
WHERE Replaces IS NULL AND ReplacedBy IS NOT NULL
and then propagating it through the results of the recursive join:
SELECT d.FirstId, t.Id, t.TStamp, t.Replaces, t.ReplacedBy, d.Depth + 1
FROM Data d
JOIN Thing t ON t.Replaces = d.Id
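Just adding that to my original query gets me most of what I wanted. Then we add a second query to get the `LastId` for each `FirstId` (Gordon did it with `FIRST_VALUE` over a partition, but I can't do that in SQL Server 2008); using an overview query also lets me grab the number of links. We cross apply it on the `FirstId` value to get the overall result I wanted.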
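One caveat about the overview query: `MAX(Id) AS LastId` only picks the right row if a replacement always has a higher `Id` than the row it replaces. That holds for the sample data, but it is an assumption about the real table. As a minimal sketch of an alternative, the overview can pick the deepest row of each chain with `ROW_NUMBER()` (available in SQL Server 2008). The `Ranked` CTE and the `rn` column are names made up for this illustration:

```sql
WITH Data AS (
    SELECT Id AS FirstId, Id, TStamp, Replaces, ReplacedBy, 0 AS Depth
    FROM Thing
    WHERE Replaces IS NULL AND ReplacedBy IS NOT NULL
    UNION ALL
    SELECT d.FirstId, t.Id, t.TStamp, t.Replaces, t.ReplacedBy, d.Depth + 1
    FROM Data d
    JOIN Thing t ON t.Replaces = d.Id
),
Ranked AS (
    -- Hypothetical helper: number each chain's rows from deepest to shallowest
    SELECT FirstId, Id, Depth,
           ROW_NUMBER() OVER (PARTITION BY FirstId ORDER BY Depth DESC) AS rn
    FROM Data
)
-- The deepest row (rn = 1) is the last link; its Depth + 1 is the chain length
SELECT FirstId, Id AS LastId, Depth + 1 AS Links
FROM Ranked
WHERE rn = 1
ORDER BY
    FirstId;
```

For the sample data this yields (1, 11, 2) and (2, 22, 3), the same values the `MAX(Id)` version of the overview produces.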
The query above returns the following for the sample data:

+---------+--------+----+-------+-------+------------+
| FirstId | LastId | Id | Links | Depth | TStamp     |
+---------+--------+----+-------+-------+------------+
| 1       | 11     | 1  | 2     | 0     | 2017-01-01 |
| 1       | 11     | 11 | 2     | 1     | 2017-01-11 |
| 2       | 22     | 2  | 3     | 0     | 2017-01-02 |
| 2       | 22     | 12 | 3     | 1     | 2017-01-12 |
| 2       | 22     | 22 | 3     | 2     | 2017-01-22 |
+---------+--------+----+-------+-------+------------+
...which is exactly what I wanted, plus `Depth` if I want it (so I know the order of the intermediate links).

If we want to include rows that were never replaced, we just change

WHERE Replaces IS NULL AND ReplacedBy IS NOT NULL

to

WHERE Replaces IS NULL

giving us:

+---------+--------+----+-------+-------+------------+
| FirstId | LastId | Id | Links | Depth | TStamp     |
+---------+--------+----+-------+-------+------------+
| 1       | 11     | 1  | 2     | 0     | 2017-01-01 |
| 1       | 11     | 11 | 2     | 1     | 2017-01-11 |
| 2       | 22     | 2  | 3     | 0     | 2017-01-02 |
| 2       | 22     | 12 | 3     | 1     | 2017-01-12 |
| 2       | 22     | 22 | 3     | 2     | 2017-01-22 |
| 3       | 3      | 3  | 1     | 0     | 2017-01-03 |
| 4       | 4      | 4  | 1     | 0     | 2017-01-04 |
+---------+--------+----+-------+-------+------------+
But we've ignored the date criterion the question asked for:

...filtered by a date range applied to the last row in the chain.
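To do that without building a massive temporary result set, we have to work backward: instead of selecting the starting points (the first entry in each chain, `Replaces IS NULL`), we select the ending points (the last entry in each chain, `ReplacedBy IS NULL`) and then walk the chain with our logic reversed. That's mostly a matter of:

- Swapping `FirstId` and `LastId`
- Swapping `Replaces` and `ReplacedBy` (handy that the table has both!)
- Using `MIN` to get the first ID in the chain rather than `MAX` to get the last
- Using `d.Depth - 1` rather than `d.Depth + 1`
- Fixing up `Depth` to get nice values where 0 = first link rather than some varying negative number: `o.Links + d.Depth - 1 AS Depth`

All of which gives us: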
WITH Data AS (
SELECT Id AS LastId, Id, TStamp, Replaces, ReplacedBy, 0 AS Depth
FROM Thing
WHERE ReplacedBy IS NULL AND Replaces IS NOT NULL
-- Filtering by date of last entry would go here
UNION ALL
SELECT d.LastId, t.Id, t.TStamp, t.Replaces, t.ReplacedBy, d.Depth - 1
FROM Data d
JOIN Thing t ON t.ReplacedBy = d.Id
),
Overview AS (
SELECT LastId, MIN(Id) AS FirstId, COUNT(*) AS Links
FROM Data
GROUP BY
LastId
)
SELECT o.FirstId, d.LastId, d.Id, o.Links, o.Links + d.Depth - 1 AS Depth, d.TStamp
FROM Data d
CROSS APPLY (
SELECT FirstId, Links
FROM Overview
WHERE LastId = d.LastId
) o
ORDER BY
o.FirstId, d.Depth
;
So for instance, if we use

AND TStamp BETWEEN '2017-01-12' AND '2017-02-01'

where I have `-- Filtering by date of last entry would go here` above, we get this result for our sample data:

+---------+--------+----+-------+-------+------------+
| FirstId | LastId | Id | Links | Depth | TStamp     |
+---------+--------+----+-------+-------+------------+
| 2       | 22     | 2  | 3     | 0     | 2017-01-02 |
| 2       | 22     | 12 | 3     | 1     | 2017-01-12 |
| 2       | 22     | 22 | 3     | 2     | 2017-01-22 |
+---------+--------+----+-------+-------+------------+

...because the last link of the chain that starts at `Id` 1 (row 11, dated 2017-01-11) is outside the date range, so we don't include that chain.
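With the filter in place, the whole thing might look like the sketch below. The `@From` and `@To` variables are hypothetical names introduced here; inline literals work just as well. Note that `BETWEEN` is inclusive at both ends.

```sql
-- Hypothetical parameters for the date range applied to the LAST row of each chain
DECLARE @From DATETIME = '20170112',
        @To   DATETIME = '20170201';

WITH Data AS (
    -- Anchor: chain ends (never replaced, but replacing something) inside the range
    SELECT Id AS LastId, Id, TStamp, Replaces, ReplacedBy, 0 AS Depth
    FROM Thing
    WHERE ReplacedBy IS NULL AND Replaces IS NOT NULL
    AND TStamp BETWEEN @From AND @To
    UNION ALL
    -- Walk backward: find the row that the current row replaced
    SELECT d.LastId, t.Id, t.TStamp, t.Replaces, t.ReplacedBy, d.Depth - 1
    FROM Data d
    JOIN Thing t ON t.ReplacedBy = d.Id
),
Overview AS (
    SELECT LastId, MIN(Id) AS FirstId, COUNT(*) AS Links
    FROM Data
    GROUP BY
        LastId
)
-- Shift the negative depths so that 0 = first link in the chain
SELECT o.FirstId, d.LastId, d.Id, o.Links, o.Links + d.Depth - 1 AS Depth, d.TStamp
FROM Data d
CROSS APPLY (
    SELECT FirstId, Links
    FROM Overview
    WHERE LastId = d.LastId
) o
ORDER BY
    o.FirstId, d.Depth
;
```

Because the date test sits in the anchor, chains whose last row falls outside the range are never walked at all, which is the point of working backward.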
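Answer 1 (score: 2)
This is a bit tricky. Arrange the CTE so that it starts at the beginning of each list. That makes the subsequent processing easier: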
WITH Data AS (
SELECT Id as FirstId, Id, TStamp, Replaces, ReplacedBy, 0 AS Depth
FROM Thing t
WHERE NOT EXISTS (SELECT 1 FROM Thing t2 WHERE t2.ReplacedBy = t.id)
UNION ALL
SELECT d.FirstId, t.Id, t.TStamp, t.Replaces, t.ReplacedBy, d.Depth + 1
FROM Data d JOIN
Thing t
ON t.Replaces = d.Id
)
SELECT d.*,
FIRST_VALUE(id) OVER (PARTITION BY FirstId ORDER BY Depth DESC) as LastId
FROM Data d;
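You can then use `FIRST_VALUE()` with a reverse sort to get the last value in the chain.

This also returns chains with no links. You can add a filter to remove them.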
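One possible filter, as a sketch: requiring the starting row to have been replaced drops the single-row chains. The extra `AND` condition is an assumption about what "no links" should mean here. Note that `FIRST_VALUE` needs SQL Server 2012 or later.

```sql
WITH Data AS (
    SELECT Id AS FirstId, Id, TStamp, Replaces, ReplacedBy, 0 AS Depth
    FROM Thing t
    WHERE NOT EXISTS (SELECT 1 FROM Thing t2 WHERE t2.ReplacedBy = t.Id)
      AND t.ReplacedBy IS NOT NULL  -- assumed filter: skip rows that were never replaced
    UNION ALL
    SELECT d.FirstId, t.Id, t.TStamp, t.Replaces, t.ReplacedBy, d.Depth + 1
    FROM Data d JOIN
         Thing t
         ON t.Replaces = d.Id
)
SELECT d.*,
       -- Deepest row per chain comes first in the descending sort, so it is the last link
       FIRST_VALUE(Id) OVER (PARTITION BY FirstId ORDER BY Depth DESC) AS LastId
FROM Data d;
```

With the sample data, rows 3 and 4 no longer appear, leaving only the two chains that start at 1 and 2.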