How do I write a query to extract individual changes from snapshot data?

Asked: 2011-10-12 14:18:23

Tags: sql sql-server sql-server-2005

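I need to create a process that extracts changes from a table in which each row is a snapshot of a row in another table. The real-world problem involves many tables with many fields, but as a simple example, suppose I have the following snapshot data: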

Sequence    DateTaken      ID       Field1    Field2
--------    -----------    ----     ------    ------
       1    '2011-01-01'      1     'Red'          2
       2    '2011-01-01'      2     'Blue'        10
       3    '2011-02-01'      1     'Green'        2
       4    '2011-03-01'      1     'Green'        3
       5    '2011-03-01'      2     'Purple'       2
       6    '2011-04-01'      1     'Yellow'       2

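The Sequence and DateTaken fields relate directly to the snapshot table itself. The ID field is the primary key of the source table, and Field1 and Field2 are other fields from the same (source) table.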
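
For anyone who wants to reproduce the sample data, here is a minimal setup sketch. The column types and the primary key on Sequence are assumptions, since the question does not state them. The INSERT uses SELECT ... UNION ALL so that it also runs on SQL Server 2005.

```sql
-- Assumed column types; adjust to match the real snapshot table.
CREATE TABLE #Snapshots (
    Sequence  INT         NOT NULL PRIMARY KEY,  -- snapshot row number
    DateTaken DATETIME    NOT NULL,              -- when the snapshot was taken
    ID        INT         NOT NULL,              -- primary key of the source row
    Field1    VARCHAR(20) NULL,
    Field2    INT         NULL
);

-- Same six rows as the table above.
INSERT INTO #Snapshots (Sequence, DateTaken, ID, Field1, Field2)
SELECT 1, '2011-01-01', 1, 'Red',     2 UNION ALL
SELECT 2, '2011-01-01', 2, 'Blue',   10 UNION ALL
SELECT 3, '2011-02-01', 1, 'Green',   2 UNION ALL
SELECT 4, '2011-03-01', 1, 'Green',   3 UNION ALL
SELECT 5, '2011-03-01', 2, 'Purple',  2 UNION ALL
SELECT 6, '2011-04-01', 1, 'Yellow',  2;
```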

I can get partway to a solution with a query like this:

WITH Snapshots (Sequence, DateTaken, ID, Field1, Field2, _Index)
AS
(
    SELECT Sequence, DateTaken, ID, Field1, Field2, ROW_NUMBER() OVER (ORDER BY ID, Sequence) _Index
    FROM #Snapshots
)
SELECT
      c.DateTaken, c.ID
    , c.Field1 Field1_Current, p.Field1 Field1_Previous, CASE WHEN c.Field1 = p.Field1 THEN 0 ELSE 1 END Field1_Changed
    , c.Field2 Field2_Current, p.Field2 Field2_Previous, CASE WHEN c.Field2 = p.Field2 THEN 0 ELSE 1 END Field2_Changed
FROM Snapshots c
JOIN Snapshots p ON p.ID = c.ID AND (p._Index + 1) = c._Index
ORDER BY c.Sequence DESC

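The query above identifies what changed from one snapshot to the next, but the result still isn't in the form I need. Each row in the output may contain multiple changes. At the end of the day, I need one row per change, identifying which field changed along with its previous and current values. Fields that did not actually change need to be excluded from the final output. So, if the output of the query above looks like this: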

DateTaken   ID  Field1_Current  Field1_Previous  Field1_Changed  Field2_Current  Field2_Previous  Field2_Changed
----------  --  --------------  ---------------  --------------  --------------  ---------------  --------------
2011-04-01  1   Yellow          Green            1               2               3                1
2011-02-01  1   Green           Red              1               2               2                0

I need to turn it into something like this:

DateTaken   ID  Field    Previous   Current
----------  --  -------  --------   ---------
2011-04-01  1   Field1   Green      Yellow
2011-04-01  1   Field2   3          2
2011-02-01  1   Field1   Red        Green

I thought I could get there with UNPIVOT, but I couldn't make it work. I consider any solution involving cursors or the like an absolute last resort.

Any suggestions are much appreciated.

2 Answers:

Answer 0: (score: 3)

Here is a working example that uses UNPIVOT. It is based on my answer to my own question, Better way to Partially UNPIVOT in Pairs in SQL.

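This has some nice features.

  1. Adding more fields is easy. Just add values to the SELECT and UNPIVOT clauses. You don't have to add additional UNION clauses.

  2. The WHERE clause, WHERE curr.value <> prev.value, does not change no matter how many fields are added.

  3. Performance is surprisingly fast.

  4. It is portable to current versions of Oracle, if you need that.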


SQL:

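    -- Sample data from the question, held in a table variable.
    Declare @Snapshots as table(
    Sequence int,
    DateTaken      datetime,
    [id] int,
    field1 varchar(20),
    field2 int)

    -- Note: a multi-row VALUES list requires SQL Server 2008 or later.
    -- On SQL Server 2005, use INSERT ... SELECT ... UNION ALL SELECT ... instead.
    INSERT INTO @Snapshots VALUES
          (1,    '2011-01-01',      1,     'Red',          2),
          (2,    '2011-01-01',      2,     'Blue',        10),
          (3,    '2011-02-01',      1,     'Green',        2),
          (4,    '2011-03-01',      1,     'Green' ,       3),
          (5,    '2011-03-01',      2,     'Purple',       2),
          (6,    '2011-04-01',      1,     'Yellow',       2)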
    
    ;WITH Snapshots (Sequence, DateTaken, ID, Field1, Field2, _Index)
    AS
    (
        SELECT Sequence, DateTaken, ID, Field1, Field2, ROW_NUMBER() OVER (ORDER BY ID, Sequence) _Index
        FROM @Snapshots
    )
    ,  data as(
    SELECT
         c._Index
        , c.DateTaken
        ,  c.ID
        , cast(c.Field1  as varchar(max)) Field1
        , cast(p.Field1  as varchar(max))Field1_Previous
        , cast(c.Field2   as varchar(max))Field2
        , cast(p.Field2  as varchar(max)) Field2_Previous 
    
    
    FROM Snapshots c
    JOIN Snapshots p ON p.ID = c.ID AND (p._Index + 1) = c._Index
    )
    
    
    , fieldsToRows 
         AS (SELECT DateTaken, 
                    id,
                    _Index,
                    value,
                    field
    
             FROM   data p UNPIVOT (value FOR field IN (field1, field1_previous, 
                                                            field2, field2_previous) ) 
                    AS unpvt
            ) 
    SELECT 
        curr.DateTaken,
        curr.ID,
        curr.field,
        prev.value previous,
        curr.value 'current'
    
    FROM 
            fieldsToRows curr 
            INNER  JOIN  fieldsToRows prev
            ON curr.ID = prev.id
                AND curr._Index = prev._Index 
                AND curr.field + '_Previous' = prev.field
    WHERE 
        curr.value <> prev.value
    

Output:

    DateTaken               ID          field     previous current
    ----------------------- ----------- --------- -------- -------
    2011-02-01 00:00:00.000 1           Field1    Red      Green
    2011-03-01 00:00:00.000 1           Field2    2        3
    2011-04-01 00:00:00.000 1           Field1    Green    Yellow
    2011-04-01 00:00:00.000 1           Field2    3        2
    2011-03-01 00:00:00.000 2           Field1    Blue     Purple
    2011-03-01 00:00:00.000 2           Field2    10       2
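
To illustrate point 1 above, here is a sketch of the same query extended with one more column. Field3 and the reduced two-row data set are hypothetical and exist only for this illustration. The lines marked "added" are the only ones that differ from the two-field version; the final WHERE clause is untouched.

```sql
-- Hypothetical data: Field3 is not part of the original question.
DECLARE @Snapshots TABLE (
    Sequence  INT,
    DateTaken DATETIME,
    ID        INT,
    Field1    VARCHAR(20),
    Field2    INT,
    Field3    VARCHAR(20)                                   -- added
);

INSERT INTO @Snapshots (Sequence, DateTaken, ID, Field1, Field2, Field3)
SELECT 1, '2011-01-01', 1, 'Red',   2, 'A' UNION ALL
SELECT 2, '2011-02-01', 1, 'Green', 2, 'B';

WITH Snapshots AS
(
    SELECT Sequence, DateTaken, ID, Field1, Field2, Field3,
           ROW_NUMBER() OVER (ORDER BY ID, Sequence) AS _Index
    FROM @Snapshots
),
data AS
(
    -- Pair each snapshot with the previous one for the same ID.
    -- UNPIVOT needs every value column to have the same type and length.
    SELECT c._Index, c.DateTaken, c.ID,
           CAST(c.Field1 AS VARCHAR(MAX)) AS Field1,
           CAST(p.Field1 AS VARCHAR(MAX)) AS Field1_Previous,
           CAST(c.Field2 AS VARCHAR(MAX)) AS Field2,
           CAST(p.Field2 AS VARCHAR(MAX)) AS Field2_Previous,
           CAST(c.Field3 AS VARCHAR(MAX)) AS Field3,            -- added
           CAST(p.Field3 AS VARCHAR(MAX)) AS Field3_Previous    -- added
    FROM Snapshots c
    JOIN Snapshots p ON p.ID = c.ID AND p._Index + 1 = c._Index
),
fieldsToRows AS
(
    SELECT DateTaken, ID, _Index, value, field
    FROM data
    UNPIVOT (value FOR field IN (Field1, Field1_Previous,
                                 Field2, Field2_Previous,
                                 Field3, Field3_Previous)) AS unpvt  -- added pair
)
SELECT curr.DateTaken, curr.ID, curr.field,
       prev.value AS previous, curr.value AS [current]
FROM fieldsToRows curr
JOIN fieldsToRows prev
  ON curr.ID = prev.ID
 AND curr._Index = prev._Index
 AND curr.field + '_Previous' = prev.field
WHERE curr.value <> prev.value;                                 -- unchanged
```

With this data the query should return one row for Field1 (Red to Green) and one for Field3 (A to B); Field2 is filtered out because its value did not change. Note that UNPIVOT drops rows whose value is NULL, so a change to or from NULL will not appear in the result.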
    

Answer 1: (score: 1)

WITH Snapshots (Sequence, DateTaken, ID, Field, FieldValue, _Index) AS
(
    SELECT
        Sequence,
        DateTaken,
        ID,
        'Field1' AS Field,
        CAST(Field1 AS VARCHAR(100)) AS FieldValue,  -- Find an appropriate length
        ROW_NUMBER() OVER (ORDER BY ID, Sequence)
    FROM
        #Snapshots
    UNION ALL
    SELECT
        Sequence,
        DateTaken,
        ID,
        'Field2' AS Field,
        CAST(Field2 AS VARCHAR(100)) AS FieldValue,  -- Find an appropriate length
        ROW_NUMBER() OVER (ORDER BY ID, Sequence)
    FROM
        #Snapshots
)
SELECT
    S2.DateTaken,   -- date of the later snapshot, where the change first appears
    S1.ID,
    S1.Field,
    S1.FieldValue AS Previous,
    S2.FieldValue As New   -- Not necessarily "Current"
FROM
    Snapshots S1
INNER JOIN Snapshots S2 ON
    S2.ID = S1.ID AND
    S2.Field = S1.Field AND
    S2._Index = S1._Index + 1 AND
    S2.FieldValue <> S1.FieldValue    -- Might need to handle NULL values
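
The NULL caveat in the last comment can be sketched as follows. Under the default ANSI_NULLS setting, a comparison such as NULL <> 'Blue' evaluates to UNKNOWN rather than TRUE, so a plain <> silently skips changes to or from NULL. One option, shown here on a hypothetical table of value pairs (the @Pairs table and its column names are made up for this example), is to spell out the NULL cases explicitly:

```sql
-- Hypothetical previous/new value pairs, including NULLs.
DECLARE @Pairs TABLE (
    PrevValue VARCHAR(100) NULL,
    NewValue  VARCHAR(100) NULL
);

INSERT INTO @Pairs (PrevValue, NewValue)
SELECT 'Red',   'Green' UNION ALL   -- changed
SELECT 'Green', 'Green' UNION ALL   -- unchanged
SELECT NULL,    'Blue'  UNION ALL   -- changed from NULL
SELECT 'Blue',  NULL    UNION ALL   -- changed to NULL
SELECT NULL,    NULL;               -- unchanged (both NULL)

-- NULL-aware "is different" test.
SELECT PrevValue, NewValue
FROM @Pairs
WHERE NewValue <> PrevValue
   OR (NewValue IS NULL AND PrevValue IS NOT NULL)
   OR (NewValue IS NOT NULL AND PrevValue IS NULL);
```

This returns the three changed pairs and leaves out both the Green/Green row and the NULL/NULL row. The same three-part condition could replace the final S2.FieldValue <> S1.FieldValue line in the join above.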