Question

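I have a SQL Server database containing audit records that show changes made to a third-party database (OpenEdge). I have no control over the structure of the audit data, nor over how the third-party database audits data changes. So I am left with the following data...

If you follow the first five rows, you can see that they all belong to TransId 1532102 (which identifies a database transaction), where TransSeq identifies a database operation within a single transaction.

The columns prefixed with New show the audited changes. If a value is NULL, no change was made to that field.

Looking at the data, you can see that in TransId = 1532102 the PrimaryIdentifier changes from 2 to -2 (row 1), then from -2 to 3 (row 3), then from 3 to 4 (row 4), and finally from 4 to 5 (row 5). You may also notice that when the PrimaryIdentifier changes from 3 to 4, the SecondaryIdentifier changes from 'abcd' to 'efgh' (row 4). So these multiple changes actually affect only a single source record. With that in mind, rows 1, 3, 4 and 5 can all be condensed into a single row (see below).

Ultimately, TransId 1532102 contains changes to only two records.

I need to convert these changes into a single UPDATE statement against the target database. To do that, I need to make sure I have one record showing the before and after values.

So, given the source data provided here, I need to produce the following data set.

What query constructs can I use to achieve this? I was considering a recursive CTE or a hierarchical structure. Ultimately I need this to perform as well as possible, so I wanted to ask here in case I have not considered every possible approach.

Thoughts welcome. Here is a script for the sample data: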

DECLARE @TestTable TABLE (SyncId INT, TransId INT, TransSeq INT, PrimaryIdentifier INT, SecondaryIdentifier NCHAR(4), NewPrimaryIdentifier INT, NewSecondaryIdentifier NCHAR(4), NewLevel INT, NewValue NVARCHAR(20))
INSERT  @TestTable
        SELECT 128, 1532102, 0,  2, 'abcd',   -2,   NULL, NULL, 'test data'
UNION   SELECT 128, 1532102, 1,  3, 'abcd',    2,   NULL, NULL, NULL
UNION   SELECT 128, 1532102, 2, -2, 'abcd',    3,   NULL, NULL, NULL
UNION   SELECT 128, 1532102, 3,  3, 'abcd',    4, 'efgh', NULL, NULL
UNION   SELECT 128, 1532102, 4,  4, 'efgh',    5,   NULL,    2, NULL
UNION   SELECT 128, 1532102, 5,  5, 'efgh', NULL, 'ghfi', NULL, NULL
UNION   SELECT 128, 1532106, 0,  3, 'abcd',   -3,   NULL, NULL, NULL
UNION   SELECT 128, 1532106, 1,  4, 'abcd',    3,   NULL, NULL, NULL
UNION   SELECT 128, 1532106, 2, -3, 'abcd',    4,   NULL, NULL, NULL
UNION   SELECT 128, 1532110, 0,  4, 'abcd',   -4,   NULL, NULL, NULL
UNION   SELECT 128, 1532110, 1,  5, 'abcd',    4,   NULL, NULL, NULL
UNION   SELECT 128, 1532110, 2, -4, 'abcd',    5,   NULL, NULL, NULL
UNION   SELECT 128, 1532114, 0,  5, 'abcd',   -5,   NULL, NULL, NULL
UNION   SELECT 128, 1532114, 1,  4, 'abcd',    5,   NULL,    1, NULL
UNION   SELECT 128, 1532114, 2, -5, 'abcd',    4,   NULL, NULL, 'some more test data'

SELECT  *
FROM    @TestTable

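Edit: I have not actually been able to write any query that successfully tracks the identifier changes. Can anyone help? I need a query that tracks the changes to the PrimaryIdentifier value and ultimately produces a single record per tracked chain, including its starting and ending values.

Edit 2: There was a since-deleted answer suggesting that I cannot update the key identifiers when condensing and should instead step through the changes one at a time. I thought it would be valuable to add my comment to the question as further information.

I need to condense the data set because of the volume of audit records generated; most of them are unnecessary and result from the way the source DBMS makes its changes. I need to reduce the data set, and I need to track key identifier changes. Within a single UPDATE statement, the updates can be applied without the ID changes clashing - see this example.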

Answer 1

Here is a second attempt at producing the output originally asked for. This time it uses a bunch of CTEs.

DECLARE @TestTable TABLE (SyncId INT, TransId INT, TransSeq INT, PrimaryIdentifier INT, SecondaryIdentifier NCHAR(4), NewPrimaryIdentifier INT, NewSecondaryIdentifier NCHAR(4), NewLevel INT, NewValue NVARCHAR(20))

INSERT  @TestTable
        SELECT 128, 1532102, 0,  2, 'abcd', -2, NULL,   NULL,   'test data'
UNION   SELECT 128, 1532102, 1,  3, 'abcd',  2, NULL,   NULL,   NULL
UNION   SELECT 128, 1532102, 2, -2, 'abcd',  3, NULL,   NULL,   NULL
UNION   SELECT 128, 1532102, 3,  3, 'abcd',  4, 'efgh', NULL,   NULL
UNION   SELECT 128, 1532102, 4,  4, 'efgh',  5, NULL,   2,      NULL
UNION   SELECT 128, 1532106, 0,  3, 'abcd', -3, NULL,   NULL,   NULL
UNION   SELECT 128, 1532106, 1,  4, 'abcd',  3, NULL,   NULL,   NULL
UNION   SELECT 128, 1532106, 2, -3, 'abcd',  4, NULL,   NULL,   NULL
UNION   SELECT 128, 1532110, 0,  4, 'abcd', -4, NULL,   NULL,   NULL
UNION   SELECT 128, 1532110, 1,  5, 'abcd',  4, NULL,   NULL,   NULL
UNION   SELECT 128, 1532110, 2, -4, 'abcd',  5, NULL,   NULL,   NULL
UNION   SELECT 128, 1532110, 3,  5, 'abcd',  6, NULL,   NULL,   NULL
UNION   SELECT 128, 1532110, 4,  6, 'abcd',  5, NULL,   NULL,   NULL
UNION   SELECT 128, 1532114, 0,  5, 'abcd', -5, NULL,   NULL,   NULL
UNION   SELECT 128, 1532114, 1,  4, 'abcd',  5, NULL,   1,      NULL
UNION   SELECT 128, 1532114, 2, -5, 'abcd',  4, NULL,   NULL,   'some more test data'


;with baseCTE as (
    select SyncId, TransId, TransSeq, PrimaryIdentifier, SecondaryIdentifier,
            isnull(NewPrimaryIdentifier, PrimaryIdentifier) as NewPrimaryIdentifier,
            isnull(NewSecondaryIdentifier, SecondaryIdentifier) as NewSecondaryIdentifier,
            NewLevel, NewValue
    from @TestTable
),
syncTransEntryPointsCte as (
    select *
    from baseCTE b
    where not exists(
        select *
        from baseCTE subb
        where b.SyncId = subb.SyncId
            and b.TransId = subb.TransId
            and b.PrimaryIdentifier = subb.NewPrimaryIdentifier
            and b.SecondaryIdentifier = subb.NewSecondaryIdentifier
            and b.TransSeq > subb.TransSeq
    )
)
, recursiveBaseCte as (
    select *, 0 as lev, TransSeq as OrigTransSec from syncTransEntryPointsCte

    union all 

    select 
        c.SyncId, c.TransId, c.TransSeq, p.PrimaryIdentifier, p.SecondaryIdentifier, c.NewPrimaryIdentifier, c.NewSecondaryIdentifier, isnull(c.NewLevel, p.NewLevel), isnull(c.NewValue, p.NewValue),
        p.lev + 1,
        p.OrigTransSec
    from baseCTE c
        join recursiveBaseCte as p on (
            c.SyncId = p.SyncId and c.TransId = p.TransId and c.PrimaryIdentifier = p.NewPrimaryIdentifier and c.SecondaryIdentifier = p.NewSecondaryIdentifier and c.TransSeq > p.TransSeq
        )
)
select r.SyncId, r.TransId, r.OrigTransSec as TransSec, 
    r.PrimaryIdentifier, r.SecondaryIdentifier, 
    nullif(r.NewPrimaryIdentifier, r.PrimaryIdentifier) as NewPrimaryIdentifier,
    nullif(r.NewSecondaryIdentifier, r.SecondaryIdentifier) as NewSecondaryIdentifier,
    r.NewLevel, r.NewValue
from recursiveBaseCte r
    join (
        select SyncId, TransId, PrimaryIdentifier, SecondaryIdentifier, max(lev) as mlev 
        from recursiveBaseCte 
        group by SyncId, TransId, PrimaryIdentifier, SecondaryIdentifier
    ) as selectForOutput on 
        r.SyncId = selectForOutput.SyncId
        and r.TransId = selectForOutput.TransId
        and r.PrimaryIdentifier = selectForOutput.PrimaryIdentifier
        and r.SecondaryIdentifier = selectForOutput.SecondaryIdentifier
        and r.lev = selectForOutput.mlev
order by 1,2,3

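Whether the CTE approach is much faster than the cursor-based one is hard to guess. I suggest you test both at a suitable time, when the server in question is not under load.

Update

The script starts by declaring baseCTE, which is only there to ensure that NewPrimaryIdentifier and NewSecondaryIdentifier have a value in every row, even if one or both were not changed by the update. This makes everything that follows easier, because we can join to the next row for the same combination within a given transaction.

syncTransEntryPointsCte in turn uses baseCTE to find all rows in a transaction that are not preceded by another row in the same transaction whose new identifiers match them.

recursiveBaseCte then uses the two preceding CTEs to recursively find rows and aggregate the changes. The final query then uses it to produce the final output.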
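To check the chain-following idea outside SQL Server, here is a minimal Python sketch. It is not the recursive CTE itself: it folds the audit rows in TransSeq order instead of recursing, and the `Audit` type and `condense` function are hypothetical names introduced only for this illustration. It assumes, like the script above, that an unchanged identifier keeps its old value and that a later non-NULL NewLevel or NewValue overrides an earlier one.

```python
from typing import NamedTuple, Optional


class Audit(NamedTuple):
    trans_id: int
    seq: int
    pid: int                  # PrimaryIdentifier
    sid: str                  # SecondaryIdentifier
    new_pid: Optional[int]    # None means "not changed"
    new_sid: Optional[str]
    new_level: Optional[int]
    new_value: Optional[str]


def condense(rows):
    """Fold every chain of key changes in a transaction into one row."""
    open_chains = {}  # (trans_id, current pid, current sid) -> chain state
    for r in sorted(rows, key=lambda r: (r.trans_id, r.seq)):
        # Same idea as baseCTE: an unchanged identifier keeps its old value.
        new_pid = r.pid if r.new_pid is None else r.new_pid
        new_sid = r.sid if r.new_sid is None else r.new_sid
        # Continue the chain that currently ends at this key, or start one.
        chain = open_chains.pop((r.trans_id, r.pid, r.sid), None)
        if chain is None:
            chain = {"trans_id": r.trans_id, "seq": r.seq,
                     "pid": r.pid, "sid": r.sid,
                     "new_level": None, "new_value": None}
        chain["new_pid"], chain["new_sid"] = new_pid, new_sid
        # Later non-NULL values overwrite earlier ones.
        if r.new_level is not None:
            chain["new_level"] = r.new_level
        if r.new_value is not None:
            chain["new_value"] = r.new_value
        open_chains[(r.trans_id, new_pid, new_sid)] = chain
    return sorted(open_chains.values(),
                  key=lambda c: (c["trans_id"], c["seq"]))


# Transaction 1532102 from the test data above.
rows = [
    Audit(1532102, 0, 2, "abcd", -2, None, None, "test data"),
    Audit(1532102, 1, 3, "abcd", 2, None, None, None),
    Audit(1532102, 2, -2, "abcd", 3, None, None, None),
    Audit(1532102, 3, 3, "abcd", 4, "efgh", None, None),
    Audit(1532102, 4, 4, "efgh", 5, None, 2, None),
]
result = condense(rows)
for chain in result:
    print(chain)
```

For this transaction the five audit rows collapse into two chains: (2, 'abcd') ends up as (5, 'efgh') carrying the level and value changes, and (3, 'abcd') ends up as (2, 'abcd').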

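If you can manage to apply the update for one condensed transaction in a single UPDATE statement, the output should be usable to update an older copy of the source table. If, as I originally assumed, you try to build one UPDATE statement per row of the condensed audit output, it will not work.

Finally, the obligatory disclaimer: this appears to work with the test data you provided in the question. I cannot guarantee it will work on the real thing, so use it with care.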

Answer 2

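I assume that 1) (PrimaryIdentifier, SecondaryIdentifier) is the PK of the target table, and
2) every transaction in the audit table leaves the target table in a consistent state. So updating the PK in a single statement per transaction using CASE will work fine: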

declare @t table (id int primary key, old int);
insert @t(id, old) values (4,4),(5,5);
update @t set id = case id 
     when 4 then 5 
     when 5 then 4 end;
select * from @t;

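The plan is to 1. condense the transactions and 2. generate the UPDATE SQL into a temporary table. You can then run all or selected items from the temporary table. Each item has the form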

UPDATE myTable SET 
         PrimaryIdentifier = CASE WHEN PrimaryIdentifier=2 AND SecondaryIdentifier='abcd' THEN 5 
                                  WHEN PrimaryIdentifier=3 AND SecondaryIdentifier='abcd' THEN 2 END,  
        SecondaryIdentifier = CASE WHEN PrimaryIdentifier=2 AND SecondaryIdentifier='abcd' THEN 'efgh' 
                                   WHEN PrimaryIdentifier=3 AND SecondaryIdentifier='abcd' THEN 'abcd' END , 
        Level= CASE WHEN PrimaryIdentifier=2 AND SecondaryIdentifier='abcd' THEN 2 
                    WHEN PrimaryIdentifier=3 AND SecondaryIdentifier='abcd' THEN  Level  END , 
        Value= CASE WHEN PrimaryIdentifier=2 AND SecondaryIdentifier='abcd' THEN 'test data' 
                    WHEN PrimaryIdentifier=3 AND SecondaryIdentifier='abcd' THEN  Value  END
WHERE 1=2 OR (PrimaryIdentifier=2 AND SecondaryIdentifier='abcd') 
          OR (PrimaryIdentifier=3 AND SecondaryIdentifier='abcd')
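As a rough illustration of how a statement of this form can be assembled from condensed rows, here is a small Python sketch. It is only a model of the string building, not the T-SQL generator used in this answer; `sql_literal` and `build_update` are hypothetical helper names, and the row dictionaries are an assumed shape for the condensed output. An unchanged column (None) is assigned back to itself, as in the statement above.

```python
def sql_literal(value):
    """Render an int or str as a T-SQL literal, doubling embedded quotes."""
    if isinstance(value, int):
        return str(value)
    return "'" + value.replace("'", "''") + "'"


def build_update(table, rows):
    """Build one UPDATE for a transaction from its condensed rows.

    Each row is a dict with pid, sid (the old key) and new_pid, new_sid,
    new_level, new_value (None meaning "unchanged").
    """
    def match(r):
        return (f"PrimaryIdentifier={sql_literal(r['pid'])} "
                f"AND SecondaryIdentifier={sql_literal(r['sid'])}")

    def case(column, field):
        # An unchanged column is assigned back to itself.
        arms = " ".join(
            f"WHEN {match(r)} THEN "
            + (column if r[field] is None else sql_literal(r[field]))
            for r in rows)
        return f"{column} = CASE {arms} END"

    assignments = ", ".join([
        case("PrimaryIdentifier", "new_pid"),
        case("SecondaryIdentifier", "new_sid"),
        case("Level", "new_level"),
        case("Value", "new_value"),
    ])
    where = " OR ".join(f"({match(r)})" for r in rows)
    return f"UPDATE {table} SET {assignments} WHERE {where}"


# Condensed rows for transaction 1532102.
rows = [
    {"pid": 2, "sid": "abcd", "new_pid": 5, "new_sid": "efgh",
     "new_level": 2, "new_value": "test data"},
    {"pid": 3, "sid": "abcd", "new_pid": 2, "new_sid": "abcd",
     "new_level": None, "new_value": None},
]
stmt = build_update("myTable", rows)
print(stmt)
```

Because every CASE expression is evaluated against the pre-update values of the row, all key changes in the transaction take effect together. Building SQL by concatenating values is only reasonable here because the values come from the audit table itself; anything user-supplied would need stricter handling.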

Query

DECLARE @TestTable TABLE (SyncId INT, TransId INT, TransSeq INT, PrimaryIdentifier INT, SecondaryIdentifier NCHAR(4), NewPrimaryIdentifier INT, NewSecondaryIdentifier NCHAR(4), NewLevel INT, NewValue NVARCHAR(20))
INSERT  @TestTable
        SELECT 128, 1532102, 0,  2, 'abcd', -2, NULL,   NULL,   'test data'
UNION   SELECT 128, 1532102, 1,  3, 'abcd',  2, NULL,   NULL,   NULL
UNION   SELECT 128, 1532102, 2, -2, 'abcd',  3, NULL,   NULL,   NULL
UNION   SELECT 128, 1532102, 3,  3, 'abcd',  4, 'efgh', NULL,   NULL
UNION   SELECT 128, 1532102, 4,  4, 'efgh',  5, NULL,   2,      NULL
UNION   SELECT 128, 1532106, 0,  3, 'abcd', -3, NULL,   NULL,   NULL
UNION   SELECT 128, 1532106, 1,  4, 'abcd',  3, NULL,   NULL,   NULL
UNION   SELECT 128, 1532106, 2, -3, 'abcd',  4, NULL,   NULL,   NULL
UNION   SELECT 128, 1532110, 0,  4, 'abcd', -4, NULL,   NULL,   NULL
UNION   SELECT 128, 1532110, 1,  5, 'abcd',  4, NULL,   NULL,   NULL
UNION   SELECT 128, 1532110, 2, -4, 'abcd',  5, NULL,   NULL,   NULL
UNION   SELECT 128, 1532110, 3,  5, 'abcd',  6, NULL,   NULL,   NULL
UNION   SELECT 128, 1532110, 4,  6, 'abcd',  5, NULL,   NULL,   NULL
UNION   SELECT 128, 1532114, 0,  5, 'abcd', -5, NULL,   NULL,   NULL
UNION   SELECT 128, 1532114, 1,  4, 'abcd',  5, NULL,   1,      NULL
UNION   SELECT 128, 1532114, 2, -5, 'abcd',  4, NULL,   NULL,   'some more test data'
;
WITH root AS (
    -- Top parent updates within transactions
    SELECT SyncId, TransId, TransSeq, PrimaryIdentifier AS rPrimaryIdentifier, SecondaryIdentifier AS rSecondaryIdentifier, 
    NewPrimaryIdentifier, 
    coalesce(NewSecondaryIdentifier, SecondaryIdentifier) AS NewSecondaryIdentifier,
    newLevel, NewValue
    FROM  @TestTable t
    WHERE NOT EXISTS (SELECT 1 
                   FROM  @TestTable t2 
                   WHERE t2.SyncId=t.SyncId AND t2.TransId = t.TransId
                       AND t2.TransSeq < t.TransSeq 
                       AND t.PrimaryIdentifier = t2.NewPrimaryIdentifier
                       AND t.SecondaryIdentifier = coalesce(t2.NewSecondaryIdentifier, t2.SecondaryIdentifier) 
                   )
    -- recursion to track the chain of updates
    UNION ALL
    SELECT root.SyncId, root.TransId, t.TransSeq, rPrimaryIdentifier, rSecondaryIdentifier,
         t.NewPrimaryIdentifier,
         coalesce(t.NewSecondaryIdentifier, root.NewSecondaryIdentifier),
         -- a later change in the chain overrides an earlier one
         coalesce(t.NewLevel, root.NewLevel), coalesce(t.NewValue, root.NewValue)
    FROM root 
    JOIN @TestTable t ON root.SyncId=t.SyncId AND root.TransId = t.TransId
                       AND root.TransSeq < t.TransSeq 
                       AND t.PrimaryIdentifier = root.NewPrimaryIdentifier
                       AND t.SecondaryIdentifier = root.NewSecondaryIdentifier

)
,condensed as (
    -- last update in the chain
    SELECT TOP(1) WITH TIES *  
    FROM root
    ORDER BY row_number() over (partition by SyncId, TransId, rPrimaryIdentifier, rSecondaryIdentifier 
                                order by TransSeq desc)
)
-- generate sql
SELECT SyncId, TransId, sql = 'UPDATE myTable SET PrimaryIdentifier = CASE'

    + (SELECT ' WHEN PrimaryIdentifier='+ CAST(rPrimaryIdentifier as varchar(20)) 
             +' AND SecondaryIdentifier=''' + rSecondaryIdentifier 
             +''' THEN ' + CAST(NewPrimaryIdentifier as varchar(20))             
        FROM condensed c2 
        WHERE c1.SyncId = c2.SyncId AND  c1.TransId= c2.TransId
        FOR XML PATH('') ) 
    + ' END,  SecondaryIdentifier = CASE'
    + (SELECT ' WHEN PrimaryIdentifier='+ CAST(rPrimaryIdentifier as varchar(20)) 
             +' AND SecondaryIdentifier=''' + rSecondaryIdentifier
             +''' THEN ''' + NewSecondaryIdentifier + ''''
        FROM condensed c2 
        WHERE c1.SyncId = c2.SyncId AND  c1.TransId= c2.TransId
        FOR XML PATH('') )
    + ' END , Level= CASE'
    + (SELECT ' WHEN PrimaryIdentifier='+ CAST(rPrimaryIdentifier as varchar(20)) 
             +' AND SecondaryIdentifier=''' + rSecondaryIdentifier
             +''' THEN ' 
             + CASE WHEN NewLevel IS NULL THEN ' Level ' ELSE CAST(NewLevel  as varchar(20)) END 
        FROM condensed c2 
        WHERE c1.SyncId = c2.SyncId AND  c1.TransId= c2.TransId
        FOR XML PATH('') )
    + ' END , Value= CASE'
    + (SELECT ' WHEN PrimaryIdentifier='+ CAST(rPrimaryIdentifier as varchar(20)) 
             +' AND SecondaryIdentifier=''' + rSecondaryIdentifier
             +''' THEN ' 
             + CASE WHEN NewValue IS NULL THEN ' Value ' ELSE '''' + NewValue + '''' END 
        FROM condensed c2 
        WHERE c1.SyncId = c2.SyncId AND  c1.TransId= c2.TransId
        FOR XML PATH('') )
     + ' END'
     + ' WHERE 1=2'
     + (SELECT ' OR (PrimaryIdentifier='+ CAST(rPrimaryIdentifier as varchar(20)) 
         +' AND SecondaryIdentifier=''' + rSecondaryIdentifier +''')'
    FROM condensed c2 
    WHERE c1.SyncId = c2.SyncId AND  c1.TransId= c2.TransId
    FOR XML PATH('') )
INTO #UpdSql    
FROM condensed c1 
GROUP BY SyncId, TransId


SELECT * 
FROM #UpdSql
ORDER BY SyncId, TransId

Edit

This version takes into account that NewPrimaryIdentifier can also be NULL. See the row added to @TestTable. The SQL generation is skipped.

DECLARE @TestTable TABLE (SyncId INT, TransId INT, TransSeq INT, PrimaryIdentifier INT, SecondaryIdentifier NCHAR(4), NewPrimaryIdentifier INT, NewSecondaryIdentifier NCHAR(4), NewLevel INT, NewValue NVARCHAR(20))
INSERT  @TestTable
        SELECT 128, 1532102, 0,  2, 'abcd', -2, NULL,   NULL,   'test data'
UNION   SELECT 128, 1532102, 1,  3, 'abcd',  2, NULL,   NULL,   NULL
UNION   SELECT 128, 1532102, 2, -2, 'abcd',  3, NULL,   NULL,   NULL
UNION   SELECT 128, 1532102, 3,  3, 'abcd',  4, 'efgh', NULL,   NULL
UNION   SELECT 128, 1532102, 4,  4, 'efgh',  5, NULL,   2,      NULL
UNION   SELECT 128, 1532102, 5,  5, 'efgh', null, 'ghfi', null, NULL -- added
UNION   SELECT 128, 1532106, 0,  3, 'abcd', -3, NULL,   NULL,   NULL
UNION   SELECT 128, 1532106, 1,  4, 'abcd',  3, NULL,   NULL,   NULL
UNION   SELECT 128, 1532106, 2, -3, 'abcd',  4, NULL,   NULL,   NULL
UNION   SELECT 128, 1532110, 0,  4, 'abcd', -4, NULL,   NULL,   NULL
UNION   SELECT 128, 1532110, 1,  5, 'abcd',  4, NULL,   NULL,   NULL
UNION   SELECT 128, 1532110, 2, -4, 'abcd',  5, NULL,   NULL,   NULL
UNION   SELECT 128, 1532110, 3,  5, 'abcd',  6, NULL,   NULL,   NULL
UNION   SELECT 128, 1532110, 4,  6, 'abcd',  5, NULL,   NULL,   NULL
UNION   SELECT 128, 1532114, 0,  5, 'abcd', -5, NULL,   NULL,   NULL
UNION   SELECT 128, 1532114, 1,  4, 'abcd',  5, NULL,   1,      NULL
UNION   SELECT 128, 1532114, 2, -5, 'abcd',  4, NULL,   NULL,   'some more test data'
;
WITH root AS (
    -- Top parent updates within transactions
    SELECT SyncId, TransId, TransSeq, PrimaryIdentifier AS rPrimaryIdentifier, SecondaryIdentifier AS rSecondaryIdentifier,
    coalesce(NewPrimaryIdentifier, PrimaryIdentifier) AS NewPrimaryIdentifier,
    coalesce(NewSecondaryIdentifier, SecondaryIdentifier) AS NewSecondaryIdentifier,
    newLevel, NewValue
    FROM  @TestTable t
    WHERE NOT EXISTS (SELECT 1
                   FROM  @TestTable t2
                   WHERE t2.SyncId=t.SyncId AND t2.TransId = t.TransId
                       AND t2.TransSeq < t.TransSeq
                       AND t.PrimaryIdentifier = coalesce(t2.NewPrimaryIdentifier, t2.PrimaryIdentifier)
                       AND t.SecondaryIdentifier = coalesce(t2.NewSecondaryIdentifier, t2.SecondaryIdentifier)
                   )
    -- recursion to track the chain of updates
    UNION ALL
    SELECT root.SyncId, root.TransId, t.TransSeq, rPrimaryIdentifier, rSecondaryIdentifier,
         coalesce(t.NewPrimaryIdentifier, root.NewPrimaryIdentifier),
         coalesce(t.NewSecondaryIdentifier, root.NewSecondaryIdentifier),
         coalesce(t.NewLevel, root.NewLevel), coalesce(t.NewValue, root.NewValue)
    FROM root
    JOIN @TestTable t ON root.SyncId=t.SyncId AND root.TransId = t.TransId
                       AND root.TransSeq < t.TransSeq
                       AND t.PrimaryIdentifier = root.NewPrimaryIdentifier
                       AND t.SecondaryIdentifier = root.NewSecondaryIdentifier
)
,condensed as (
    -- last update in the chain
    SELECT TOP(1) WITH TIES *
    FROM root
    ORDER BY row_number() over (partition by SyncId, TransId, rPrimaryIdentifier, rSecondaryIdentifier
                                order by TransSeq desc)
)
SELECT *
FROM condensed
ORDER BY SyncId, TransId, rPrimaryIdentifier, rSecondaryIdentifier

Answer 3

Here is a first attempt at getting the desired output. It uses a CURSOR, so do not expect great performance.

set nocount on

DECLARE @TestTable TABLE (SyncId INT, TransId INT, TransSeq INT, PrimaryIdentifier INT, SecondaryIdentifier NCHAR(4), NewPrimaryIdentifier INT, NewSecondaryIdentifier NCHAR(4), NewLevel INT, NewValue NVARCHAR(20))
DECLARE @OutputTable TABLE (SyncId INT, TransId INT, TransSeq INT, PrimaryIdentifier INT, SecondaryIdentifier NCHAR(4), NewPrimaryIdentifier INT, NewSecondaryIdentifier NCHAR(4), NewLevel INT, NewValue NVARCHAR(20))

INSERT  @TestTable
        SELECT 128, 1532102, 0,  2, 'abcd', -2, NULL,   NULL,   'test data'
UNION   SELECT 128, 1532102, 1,  3, 'abcd',  2, NULL,   NULL,   NULL
UNION   SELECT 128, 1532102, 2, -2, 'abcd',  3, NULL,   NULL,   NULL
UNION   SELECT 128, 1532102, 3,  3, 'abcd',  4, 'efgh', NULL,   NULL
UNION   SELECT 128, 1532102, 4,  4, 'efgh',  5, NULL,   2,      NULL
UNION   SELECT 128, 1532106, 0,  3, 'abcd', -3, NULL,   NULL,   NULL
UNION   SELECT 128, 1532106, 1,  4, 'abcd',  3, NULL,   NULL,   NULL
UNION   SELECT 128, 1532106, 2, -3, 'abcd',  4, NULL,   NULL,   NULL
UNION   SELECT 128, 1532110, 0,  4, 'abcd', -4, NULL,   NULL,   NULL
UNION   SELECT 128, 1532110, 1,  5, 'abcd',  4, NULL,   NULL,   NULL
UNION   SELECT 128, 1532110, 2, -4, 'abcd',  5, NULL,   NULL,   NULL
UNION   SELECT 128, 1532110, 3,  5, 'abcd',  6, NULL,   NULL,   NULL
UNION   SELECT 128, 1532110, 4,  6, 'abcd',  5, NULL,   NULL,   NULL
UNION   SELECT 128, 1532114, 0,  5, 'abcd', -5, NULL,   NULL,   NULL
UNION   SELECT 128, 1532114, 1,  4, 'abcd',  5, NULL,   1,      NULL
UNION   SELECT 128, 1532114, 2, -5, 'abcd',  4, NULL,   NULL,   'some more test data'

--SELECT * FROM @TestTable

declare @cSyncId int, @cTransId int, @cTransSeq int, @cPrimaryId int, @cSecondaryId nchar(4), @cNewPrimaryId int, @cNewSecondary nchar(4), @cNewLevel int, @cNewValue nvarchar(20)
declare @newTransSeq int, @prevSyncId int, @prevTransId int
set @newTransSeq = 0
set @prevSyncId = 0
set @prevTransId = 0

declare auditCursor CURSOR for
    select SyncId, TransId, TransSeq, PrimaryIdentifier, SecondaryIdentifier,
        isnull(NewPrimaryIdentifier, PrimaryIdentifier) as NewPrimaryIdentifier,
        isnull(NewSecondaryIdentifier, SecondaryIdentifier) as NewSecondaryIdentifier,
        NewLevel, NewValue
    from @TestTable
    order by SyncId, TransId, TransSeq

open auditCursor

fetch next from auditCursor into @cSyncId, @cTransId, @cTransSeq, @cPrimaryId, @cSecondaryId, @cNewPrimaryId, @cNewSecondary, @cNewLevel, @cNewValue
while @@FETCH_STATUS = 0
begin
    if @prevSyncId != @cSyncId or @prevTransId != @cTransId
    begin
        set @newTransSeq = 0
        set @prevSyncId = @cSyncId
        set @prevTransId = @cTransId
    end

    if(not exists(select * from @OutputTable where SyncId = @cSyncId and TransId = @cTransId and NewPrimaryIdentifier = @cPrimaryId and NewSecondaryIdentifier = @cSecondaryId))
        begin
            insert into @OutputTable values(@cSyncId, @cTransId, @newTransSeq, @cPrimaryId, @cSecondaryId, @cNewPrimaryId, @cNewSecondary, @cNewLevel, @cNewValue)
            set @newTransSeq = @newTransSeq + 1
        end
    else
        begin
            update @OutputTable
            set NewPrimaryIdentifier = isnull(@cNewPrimaryId, NewPrimaryIdentifier),
                NewSecondaryIdentifier = isnull(@cNewSecondary, NewSecondaryIdentifier),
                NewLevel = isnull(@cNewLevel, NewLevel),
                NewValue = isnull(@cNewValue, NewValue)
            where SyncId = @cSyncId
                and TransId = @cTransId
                and NewPrimaryIdentifier = @cPrimaryId
                and NewSecondaryIdentifier = @cSecondaryId
        end

    fetch next from auditCursor into @cSyncId, @cTransId, @cTransSeq, @cPrimaryId, @cSecondaryId, @cNewPrimaryId, @cNewSecondary, @cNewLevel, @cNewValue
end
deallocate auditCursor

select 
    SyncId, TransId, TransSeq, PrimaryIdentifier, SecondaryIdentifier,
    nullif(NewPrimaryIdentifier, PrimaryIdentifier) as NewPrimaryIdentifier,
    nullif(NewSecondaryIdentifier, SecondaryIdentifier) as NewSecondaryIdentifier,
    NewLevel, NewValue
from @OutputTable order by 1,2,3

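As far as I can tell, this gives the output you want. However, whether it is really the output you should want depends on what you intend to do with it.

For example, if you are going to use the output to somehow generate an update script that keeps a copy of the database in sync with the source database, it cannot be used.

If we look at the output for transaction 1532106, the condensed audit changes primary id 3 to 4 and then changes primary id 4 to 3. Applied one statement at a time, that will of course not work.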
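The problem can be seen in a small Python sketch that models only the key values, with no constraints. The function names are hypothetical, and the comparison with a single UPDATE using a CASE expression is an analogy rather than something executed here: in a real table with a primary key, the first of the one-by-one statements would already fail with a key violation instead of silently merging rows.

```python
def apply_one_by_one(keys, changes):
    """Apply each (old, new) change as its own UPDATE, in order."""
    keys = list(keys)
    for old, new in changes:
        keys = [new if k == old else k for k in keys]
    return keys


def apply_together(keys, changes):
    """Apply all changes at once, as one UPDATE with a CASE expression would."""
    mapping = dict(changes)
    return [mapping.get(k, k) for k in keys]


keys = [3, 4]                 # primary ids before transaction 1532106
changes = [(3, 4), (4, 3)]    # condensed output: 3 -> 4 and 4 -> 3

sequential = apply_one_by_one(keys, changes)
simultaneous = apply_together(keys, changes)
print(sequential)    # both rows end up with the same id
print(simultaneous)  # the two ids are swapped
```

Applied one after the other, the first change moves row 3 onto id 4, and the second change then moves both rows to id 3. Applied together, the two rows simply trade ids.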

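From the look of the audit trail, the program manipulating the table seems to switch the primary id to a negative value when it needs to free up an id on a row. If we change one line in the sample: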

if(not exists(select * from @OutputTable where SyncId = @cSyncId and TransId = @cTransId and NewPrimaryIdentifier = @cPrimaryId and NewSecondaryIdentifier = @cSecondaryId))

to

if(not exists(select * from @OutputTable where SyncId = @cSyncId and TransId = @cTransId and NewPrimaryIdentifier = @cPrimaryId and NewSecondaryIdentifier = @cSecondaryId) or @cPrimaryId < 0)

(adding or @cPrimaryId < 0) then we get a different, less condensed output which, as far as I can tell, should be workable for the case above.

Answer 4

Here is a way to get the "latest condensed record" using SQL only. Since I do not have the full data set, I cannot tell you how well it performs.

DECLARE @TestTable TABLE (SyncId INT, TransId INT, TransSeq INT, PrimaryIdentifier INT, SecondaryIdentifier NCHAR(4), NewPrimaryIdentifier INT, NewSecondaryIdentifier NCHAR(4), NewLevel INT, NewValue NVARCHAR(20))
INSERT  @TestTable
        SELECT 128, 1532102, 0,  2, 'abcd', -2, NULL,   NULL,   'test data'
UNION   SELECT 128, 1532102, 1,  3, 'abcd',  2, NULL,   NULL,   NULL
UNION   SELECT 128, 1532102, 2, -2, 'abcd',  3, NULL,   NULL,   NULL
UNION   SELECT 128, 1532102, 3,  3, 'abcd',  4, 'efgh', NULL,   NULL
UNION   SELECT 128, 1532102, 4,  4, 'efgh',  5, NULL,   2,      NULL
UNION   SELECT 128, 1532106, 0,  3, 'abcd', -3, NULL,   NULL,   NULL
UNION   SELECT 128, 1532106, 1,  4, 'abcd',  3, NULL,   NULL,   NULL
UNION   SELECT 128, 1532106, 2, -3, 'abcd',  4, NULL,   NULL,   NULL
UNION   SELECT 128, 1532110, 0,  4, 'abcd', -4, NULL,   NULL,   NULL
UNION   SELECT 128, 1532110, 1,  5, 'abcd',  4, NULL,   NULL,   NULL
UNION   SELECT 128, 1532110, 2, -4, 'abcd',  5, NULL,   NULL,   NULL
UNION   SELECT 128, 1532114, 0,  5, 'abcd', -5, NULL,   NULL,   NULL
UNION   SELECT 128, 1532114, 1,  4, 'abcd',  5, NULL,   1,      NULL
UNION   SELECT 128, 1532114, 2, -5, 'abcd',  4, NULL,   NULL,   'some more test data';

WITH data AS (
  SELECT *
       , ROW_NUMBER() OVER(PARTITION BY TRANSID ORDER BY CASE WHEN PrimaryIdentifier IS NULL THEN 1 ELSE 0 END, TRANSSeq desc) AS rn_PrimaryIdentifier
       , ROW_NUMBER() OVER(PARTITION BY TRANSID ORDER BY CASE WHEN SecondaryIdentifier IS NULL THEN 1 ELSE 0 END, TRANSSeq desc) AS rn_SecondaryIdentifier
       , ROW_NUMBER() OVER(PARTITION BY TRANSID ORDER BY CASE WHEN NewPrimaryIdentifier IS NULL THEN 1 ELSE 0 END, TRANSSeq desc) AS rn_NewPrimaryIdentifier
       , ROW_NUMBER() OVER(PARTITION BY TRANSID ORDER BY CASE WHEN NewSecondaryIdentifier IS NULL THEN 1 ELSE 0 END, TRANSSeq desc) AS rn_NewSecondaryIdentifier
       , ROW_NUMBER() OVER(PARTITION BY TRANSID ORDER BY CASE WHEN NewLevel IS NULL THEN 1 ELSE 0 END, TRANSSeq desc) AS rn_NewLevel
       , ROW_NUMBER() OVER(PARTITION BY TRANSID ORDER BY CASE WHEN NewValue IS NULL THEN 1 ELSE 0 END, TRANSSeq desc) AS rn_NewValue
    FROM @TestTable
)
, transIds
AS (
  SELECT DISTINCT SyncId, TransId
    FROM @TestTable)
SELECT t.SyncId
     , t.TransId
     , (SELECT d.PrimaryIdentifier FROM data d WHERE d.TransId = t.TransId AND d.rn_PrimaryIdentifier = 1) AS PrimaryIdentifier
     , (SELECT d.SecondaryIdentifier FROM data d WHERE d.TransId = t.TransId AND d.rn_SecondaryIdentifier = 1) AS SecondaryIdentifier
     , (SELECT d.NewPrimaryIdentifier FROM data d WHERE d.TransId = t.TransId AND d.rn_NewPrimaryIdentifier = 1) AS NewPrimaryIdentifier
     , (SELECT d.NewSecondaryIdentifier FROM data d WHERE d.TransId = t.TransId AND d.rn_NewSecondaryIdentifier = 1) AS NewSecondaryIdentifier
     , (SELECT d.NewLevel FROM data d WHERE d.TransId = t.TransId AND d.rn_NewLevel = 1) AS NewLevel
     , (SELECT d.NewValue FROM data d WHERE d.TransId = t.TransId AND d.rn_NewValue = 1) AS NewValue
  FROM transIds t;

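I am using two CTEs. "data" contains all the data plus, for each column of interest, a priority ordering of the rows. "transIds" is just the distinct list of TransIds, so that the final result has one row per transaction id in the original data set.

Note the use of window functions in the data CTE: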

, ROW_NUMBER() OVER(PARTITION BY TRANSID ORDER BY CASE WHEN PrimaryIdentifier IS NULL THEN 1 ELSE 0 END, TRANSSeq desc) AS rn_PrimaryIdentifier

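The logic behind the window functions is to give the value 1 to the latest row that has a non-NULL value in the respective column. Breaking it down:

ROW_NUMBER(): produces a sequence of numbers.
PARTITION BY TRANSID: restarts the sequence for each distinct TransId.
ORDER BY CASE WHEN <column> IS NULL THEN 1 ELSE 0 END: sorts all NULL values to the end before the sequence is applied.
(ORDER BY) TRANSSeq desc: sorts so that the latest TransSeq comes first.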
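The ordering described above can be modelled in a few lines of Python, as a sketch rather than a translation of the query. The `latest_non_null` function is a hypothetical helper; it looks at the rows of one transaction (one partition) and returns the value that the row numbered 1 would carry for a given column.

```python
def latest_non_null(rows, column):
    """Return the value the row with ROW_NUMBER() = 1 would hold for a column."""
    # NULLs (None) sort last; among the rest, the highest TransSeq sorts first.
    ordered = sorted(rows, key=lambda r: (r[column] is None, -r["TransSeq"]))
    return ordered[0][column]


# The New* columns of transaction 1532114 from the sample data.
rows = [
    {"TransSeq": 0, "NewPrimaryIdentifier": -5, "NewLevel": None, "NewValue": None},
    {"TransSeq": 1, "NewPrimaryIdentifier": 5, "NewLevel": 1, "NewValue": None},
    {"TransSeq": 2, "NewPrimaryIdentifier": 4, "NewLevel": None,
     "NewValue": "some more test data"},
]

for column in ("NewPrimaryIdentifier", "NewLevel", "NewValue"):
    print(column, latest_non_null(rows, column))
```

Each column is resolved independently, so the values in one result row can come from different audit rows of the same transaction. If a column is NULL in every row of the transaction, the result for that column is NULL (None here).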

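In the final select, for each TransId, I query the data CTE to get the latest non-NULL value of each column, based on the window functions from before: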

 , (SELECT d.PrimaryIdentifier FROM data d WHERE d.TransId = t.TransId AND d.rn_PrimaryIdentifier = 1) AS PrimaryIdentifier

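In the original question you asked for both the original and the latest values. I am not sure that makes sense. If you want your own audit log of every change, you should save the "current" row to an audit log table in your database before updating. If you really want the first row from the original data set, then I suggest combining a UNION with the query above. Just append this code to the query above, in place of its terminating semicolon: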

  UNION ALL
SELECT SyncId, TransId, PrimaryIdentifier, SecondaryIdentifier, NewPrimaryIdentifier, NewSecondaryIdentifier, NewLevel, NewValue
  FROM @TestTable
 WHERE TransSeq = 0
 ORDER BY TransId;

Most performant way in SQL Server to condense multiple data changes into before and after values
