How do I write a query to find the original data from a history table?

Time: 2017-11-05 10:10:33

Tags: sql-server tsql

My table data:

PolicyId    PolicyEndorsementId OldFireLocationId NewFireLocationId
----------- ------------------- ----------------- -----------------
2167        2846                3298              4460
2167        2846                3299              4461
2167        2848                4460              4462
2167        2848                4461              4463
2167        2849                4462              4464
2167        2849                4463              4465
2167        2849                null              4466
2167        2850                4464              4467
2167        2850                4465              4468
2167        2850                4466              4469

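This is a history table in which FireLocationId 3298 and 3299 are the original FireLocationIds; they were copied to the new IDs 4460 and 4461 respectively. In the same way, 4460 and 4461 were copied to 4462 and 4463. Every further copy continues this pattern under a new PolicyEndorsementId.

In addition, for a row whose OldFireLocationId is null, the OrignalFireLocationId is that row's NewFireLocationId.

I need to find the original FireLocationId, as shown in the output below: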

PolicyId    PolicyEndorsementId OldFireLocationId NewFireLocationId OrignalFireLocationId
----------- ------------------- ----------------- ----------------- ---------------------
2167        2846                3298              4460              3298
2167        2846                3299              4461              3299
2167        2848                4460              4462              3298
2167        2848                4461              4463              3299
2167        2849                4462              4464              3298
2167        2849                4463              4465              3299
2167        2849                null              4466              4466
2167        2850                4464              4467              3298
2167        2850                4465              4468              3299
2167        2850                4466              4469              4466

2 answers:

Answer 0: (score: 1)

You can try this:

DECLARE @DataSource TABLE
(
     [PolicyId] INT
    ,[PolicyEndorsementId] INT
    ,[OldFireLocationId] INT 
    ,[NewFireLocationId] INT
);

INSERT INTO @DataSource ([PolicyId], [PolicyEndorsementId], [OldFireLocationId],[NewFireLocationId])
VALUES   (2167, 2846, 3298, 4460)
        ,(2167, 2846, 3299, 4461)
        ,(2167, 2848, 4460, 4462)
        ,(2167, 2848, 4461, 4463)
        ,(2167, 2849, 4462, 4464)
        ,(2167, 2849, 4463, 4465)
        ,(2167, 2849, NULL, 4466)
        ,(2167, 2850, 4464, 4467)
        ,(2167, 2850, 4465, 4468)
        ,(2167, 2850, 4466, 4469);

WITH DataSource AS
(
    -- Anchor: the last row of each chain, i.e. a row whose
    -- NewFireLocationId is never used as an OldFireLocationId
    SELECT DS1.*
          ,DS1.[NewFireLocationId] AS Original
          ,0 AS [Level]
    FROM @DataSource DS1
    WHERE NOT EXISTS
    (
        SELECT 1
        FROM @DataSource DS2
        WHERE DS2.[OldFireLocationId] = DS1.[NewFireLocationId]
    )
    UNION ALL
    -- Recursive: step back to the row the current one was copied from
    SELECT DS2.*
          ,DS1.[Original]
          ,DS1.[Level] + 1
    FROM DataSource DS1
    INNER JOIN @DataSource DS2
        ON DS1.[OldFireLocationId] = DS2.[NewFireLocationId]
),
TempDataSource AS
(
    -- The deepest row of each chain holds the original ID
    SELECT DS1.[Original]
          ,ISNULL(DS2.[OldFireLocationId], DS2.[NewFireLocationId]) AS [NewValue]
    FROM
    (
        SELECT [Original]
              ,MAX([Level]) AS [Level]
        FROM DataSource
        GROUP BY [Original]
    ) DS1
    INNER JOIN DataSource DS2
        ON DS1.[Original] = DS2.[Original]
        AND DS1.[Level] = DS2.[Level]
)
-- Attach the original ID to every row of its chain
SELECT A.[PolicyId], A.[PolicyEndorsementId], A.[OldFireLocationId], A.[NewFireLocationId], B.[NewValue]
FROM DataSource A
INNER JOIN TempDataSource B
    ON A.[Original] = B.[Original]
ORDER BY A.[PolicyId]
        ,A.[PolicyEndorsementId]
        ,IIF(A.[OldFireLocationId] IS NULL, 1, 0)
        ,A.[OldFireLocationId];

For the sample data it returns the ten rows shown in the question, with the original ID in the [NewValue] column.

Because your table structure gives us no way to tell which value is the first and which is the last in a chain, we need two CTEs to work it out.

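The first CTE is recursive. Its first part gets all the last rows: those whose NewFireLocationId never appears as an OldFireLocationId in another row. The recursive part of the CTE follows the chain from last back to first. It also adds two columns, Level and Original. Despite its name, Original carries the last NewFireLocationId of the chain, and Level counts the steps back from that last row.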
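The anchor step described above can be tried on its own. The following is only a sketch: it inlines the question's sample rows in a CTE named History (a name chosen here for illustration, not part of the answer) so that it runs without the table variable.

```sql
-- Sample rows from the question, inlined so the snippet runs on its own.
-- "History" is an illustrative name, not one used in the answer.
WITH History AS
(
    SELECT *
    FROM (VALUES (2167, 2846, 3298, 4460)
                ,(2167, 2846, 3299, 4461)
                ,(2167, 2848, 4460, 4462)
                ,(2167, 2848, 4461, 4463)
                ,(2167, 2849, 4462, 4464)
                ,(2167, 2849, 4463, 4465)
                ,(2167, 2849, NULL, 4466)
                ,(2167, 2850, 4464, 4467)
                ,(2167, 2850, 4465, 4468)
                ,(2167, 2850, 4466, 4469)
         ) AS V ([PolicyId], [PolicyEndorsementId], [OldFireLocationId], [NewFireLocationId])
)
-- Anchor member only: rows whose new ID was never copied again.
SELECT H1.[PolicyEndorsementId], H1.[OldFireLocationId], H1.[NewFireLocationId]
FROM History H1
WHERE NOT EXISTS
(
    SELECT 1
    FROM History H2
    WHERE H2.[OldFireLocationId] = H1.[NewFireLocationId]
)
ORDER BY H1.[NewFireLocationId];
```

For the sample data only the three rows of endorsement 2850 remain (new IDs 4467, 4468 and 4469); each of them becomes the starting point of one backward walk.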

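We use these computed columns in the second CTE to get the OrignalFireLocationId. For each Original value we take the row with the maximum Level, which is the first row of the chain, and read the value from it: its OldFireLocationId, or its NewFireLocationId when OldFireLocationId is NULL.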
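The "maximum Level per Original" step can be looked at in isolation. In this sketch the rows of Walk are written out by hand: they are what the recursive CTE should produce for two of the sample chains (the ones ending in 4467 and 4469), worked out manually rather than copied from a real run. Walk and Deepest are illustrative names.

```sql
-- Hand-written walk output for two chains (assumption: derived by hand
-- from the sample data, not captured from an actual execution).
WITH Walk AS
(
    SELECT *
    FROM (VALUES (4467, 0, 4464, 4467)
                ,(4467, 1, 4462, 4464)
                ,(4467, 2, 4460, 4462)
                ,(4467, 3, 3298, 4460)
                ,(4469, 0, 4466, 4469)
                ,(4469, 1, NULL, 4466)
         ) AS V ([Original], [Level], [OldFireLocationId], [NewFireLocationId])
),
Deepest AS
(
    -- The highest Level marks the first row of each chain
    SELECT [Original], MAX([Level]) AS [Level]
    FROM Walk
    GROUP BY [Original]
)
SELECT W.[Original]
      ,ISNULL(W.[OldFireLocationId], W.[NewFireLocationId]) AS [NewValue]
FROM Walk W
INNER JOIN Deepest D
    ON D.[Original] = W.[Original]
   AND D.[Level] = W.[Level]
ORDER BY W.[Original];
```

The chain ending in 4467 resolves to 3298, and the chain ending in 4469 resolves to 4466, because its first row has a NULL OldFireLocationId and ISNULL falls back to NewFireLocationId.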

The last part just joins the two CTE results and selects only the columns we need.

Answer 1: (score: 1)

Here is one way to do it, using a recursive CTE (Yourtable stands for your history table):

;WITH cte
     AS (-- Anchor: every row starts out as its own candidate parent
         SELECT *,
                Isnull(OldFireLocationId, NewFireLocationId) AS parent,
                level = 1
         FROM   Yourtable
         UNION ALL
         -- Recursive: a later copy inherits the parent of the row it was copied from
         SELECT a.*,
                c.parent,
                level = c.level + 1
         FROM   Yourtable a
                INNER JOIN cte c
                        ON a.PolicyId = c.PolicyId
                           AND a.OldFireLocationId = c.NewFireLocationId)
-- For each row keep the parent reached through the longest chain
SELECT TOP 1 WITH ties PolicyId,
                       PolicyEndorsementId,
                       OldFireLocationId,
                       NewFireLocationId,
                       OrignalFireLocationId = parent
FROM   cte
ORDER  BY Row_number()OVER(partition BY PolicyId, PolicyEndorsementId,Isnull(OldFireLocationId, NewFireLocationId)
              ORDER BY level DESC)
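
As a variation on the recursive CTE above (a sketch only, not part of either answer), the walk can also start from the first row of each chain and move forward, so that every row is produced exactly once and no TOP or MAX step is needed. It assumes that each NewFireLocationId is produced by at most one row and that chains stay within one PolicyId. History and Chain are illustrative names, and the sample rows are inlined so the snippet runs on its own.

```sql
WITH History AS
(
    SELECT *
    FROM (VALUES (2167, 2846, 3298, 4460)
                ,(2167, 2846, 3299, 4461)
                ,(2167, 2848, 4460, 4462)
                ,(2167, 2848, 4461, 4463)
                ,(2167, 2849, 4462, 4464)
                ,(2167, 2849, 4463, 4465)
                ,(2167, 2849, NULL, 4466)
                ,(2167, 2850, 4464, 4467)
                ,(2167, 2850, 4465, 4468)
                ,(2167, 2850, 4466, 4469)
         ) AS V ([PolicyId], [PolicyEndorsementId], [OldFireLocationId], [NewFireLocationId])
),
Chain AS
(
    -- Anchor: rows that start a chain, i.e. OldFireLocationId is NULL
    -- or was never produced as a NewFireLocationId
    SELECT H.[PolicyId], H.[PolicyEndorsementId], H.[OldFireLocationId], H.[NewFireLocationId]
          ,ISNULL(H.[OldFireLocationId], H.[NewFireLocationId]) AS [OrignalFireLocationId]
    FROM History H
    WHERE NOT EXISTS
    (
        SELECT 1
        FROM History P
        WHERE P.[PolicyId] = H.[PolicyId]
          AND P.[NewFireLocationId] = H.[OldFireLocationId]
    )
    UNION ALL
    -- Recursive: follow each copy forward, carrying the original ID along
    SELECT H.[PolicyId], H.[PolicyEndorsementId], H.[OldFireLocationId], H.[NewFireLocationId]
          ,C.[OrignalFireLocationId]
    FROM Chain C
    INNER JOIN History H
        ON H.[PolicyId] = C.[PolicyId]
       AND H.[OldFireLocationId] = C.[NewFireLocationId]
)
SELECT [PolicyId], [PolicyEndorsementId], [OldFireLocationId], [NewFireLocationId], [OrignalFireLocationId]
FROM Chain
ORDER BY [PolicyId], [PolicyEndorsementId], [NewFireLocationId];
```

SQL Server stops a recursive CTE after 100 levels by default, so chains longer than that would need an OPTION (MAXRECURSION n) hint on the final SELECT.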