TSQL 2012 - MERGE - OUTPUT - Change data capture - Inserted and Deleted union

Time: 2015-11-09 17:02:17

Tags: sql sql-server tsql merge sql-server-2012

I have just started using the OUTPUT clause of the MERGE statement, and a question came up right away:

There is a table:

CREATE TABLE t1 (
     id INT
    , somedata VARCHAR(10)
    , someotherdata INT
    );

...with some test data in it:

INSERT  INTO dbo.t1
VALUES  ( 1, 'aaa', 100 ),
        ( 2, 'bbb', 200 ),
        ( 3, 'ccc', 300 ),
        ( 4, 'ccc', 444 ),
        ( 5, 'rrr', 543 );

There is also a change data capture table, created like this:

SELECT TOP 0 CONVERT(SMALLINT, 0) action -- delete (-1) / update (0) / insert (1); signed type, since TINYINT cannot store -1
      , CONVERT(BIGINT, 0) execution_id -- from a sequence, 
                                        -- to distinguish rows
                                        -- for each MERGE operation
      , CONVERT(BIT, 0) row_version -- 0: old, 1: new
      , t2.*
INTO    t1_cdc
FROM    dbo.t1
LEFT OUTER JOIN dbo.t1 t2 ON 0 = 1;

(The left outer join guarantees that every column in the CDC table is nullable.)

There is a MERGE statement:

;WITH    cte_sample_rows
          AS ( SELECT   1 id
                      , 'aaa' somedata
                      , 100 someotherdata    -- same
               UNION ALL
               SELECT   2
                      , 'bbb'
                      , 200                  -- same
               UNION ALL
               SELECT   3
                      , 'fff'
                      , 333                  -- update
               UNION ALL
               SELECT   4
                      , 'ccc'
                      , 444                  -- same
               UNION ALL
               SELECT   50
                      , 'xxx'
                      , 5050                 -- insert id=50 / delete id=5
               )
    MERGE dbo.t1 tgt
    USING cte_sample_rows src
    ON tgt.id = src.id
    WHEN MATCHED THEN
        UPDATE SET
               tgt.somedata = src.somedata
             , tgt.someotherdata = src.someotherdata
    WHEN NOT MATCHED BY TARGET THEN
        INSERT
        VALUES ( src.id
               , src.somedata
               , src.someotherdata
               )
    WHEN NOT MATCHED BY SOURCE THEN
        DELETE;

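Now I need to capture the old and the new version of each row as two separate rows in the CDC table.

I know that an OUTPUT clause can be added to the MERGE statement above, like this: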

...
OUTPUT
    CASE $action
      WHEN 'DELETE' THEN -1
      WHEN 'UPDATE' THEN 0
      ELSE 1 -- 'INSERT'
    END AS action
  , 1 AS execution_id
  , Deleted.id id_old
  , Deleted.somedata somedata_old
  , Deleted.someotherdata someotherdata_old
  , Inserted.id id_new
  , Inserted.somedata somedata_new
  , Inserted.someotherdata someotherdata_new ;

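...but this returns all old and new values as a single row (one per affected row). I need to "unpivot" them into separate rows: one from Inserted and one from Deleted. For a delete, only the Deleted row; for an insert, only the Inserted row; for an update, both (I tell them apart by the row version: 0: old, 1: new).

I know I could populate a #temp table and then "unpivot" the data from that table in a separate step, but I am looking for a one-step operation.

Something like: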
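
For reference, the two-step variant could look roughly like the sketch below. It is only an illustration: the temp tables #t1, #merge_out and #cdc are made-up stand-ins for dbo.t1 and t1_cdc, and execution_id is hard-coded to 1 as in the OUTPUT example above. The MERGE writes one wide row per affected row into #merge_out, and a second statement splits each wide row with CROSS APPLY (VALUES ...).

```sql
-- Sketch only: #t1, #merge_out and #cdc are hypothetical stand-ins for dbo.t1 and t1_cdc.
CREATE TABLE #t1 (id INT, somedata VARCHAR(10), someotherdata INT);
INSERT INTO #t1
VALUES (1, 'aaa', 100), (2, 'bbb', 200), (3, 'ccc', 300), (4, 'ccc', 444), (5, 'rrr', 543);

CREATE TABLE #cdc (action SMALLINT, execution_id BIGINT, row_version BIT,
                   id INT, somedata VARCHAR(10), someotherdata INT);

CREATE TABLE #merge_out (action NVARCHAR(10),
                         id_old INT, somedata_old VARCHAR(10), someotherdata_old INT,
                         id_new INT, somedata_new VARCHAR(10), someotherdata_new INT);

-- Step 1: capture one wide row (old and new values side by side) per affected row.
MERGE #t1 AS tgt
USING (VALUES (1, 'aaa', 100), (2, 'bbb', 200), (3, 'fff', 333),
              (4, 'ccc', 444), (50, 'xxx', 5050)) AS src(id, somedata, someotherdata)
ON tgt.id = src.id
WHEN MATCHED THEN
    UPDATE SET tgt.somedata = src.somedata
             , tgt.someotherdata = src.someotherdata
WHEN NOT MATCHED BY TARGET THEN
    INSERT (id, somedata, someotherdata)
    VALUES (src.id, src.somedata, src.someotherdata)
WHEN NOT MATCHED BY SOURCE THEN
    DELETE
OUTPUT $action
     , Deleted.id, Deleted.somedata, Deleted.someotherdata
     , Inserted.id, Inserted.somedata, Inserted.someotherdata
INTO #merge_out (action, id_old, somedata_old, someotherdata_old,
                 id_new, somedata_new, someotherdata_new);

-- Step 2: split each wide row into an "old" (0) and/or a "new" (1) CDC row.
INSERT INTO #cdc (action, execution_id, row_version, id, somedata, someotherdata)
SELECT CASE o.action WHEN 'DELETE' THEN -1 WHEN 'UPDATE' THEN 0 ELSE 1 END
     , 1            -- placeholder execution_id, as in the OUTPUT example
     , v.row_version
     , v.id, v.somedata, v.someotherdata
FROM #merge_out AS o
CROSS APPLY (VALUES (0, o.id_old, o.somedata_old, o.someotherdata_old),
                    (1, o.id_new, o.somedata_new, o.someotherdata_new)
            ) AS v(row_version, id, somedata, someotherdata)
WHERE (v.row_version = 0 AND o.action IN ('UPDATE', 'DELETE'))   -- old rows
   OR (v.row_version = 1 AND o.action IN ('UPDATE', 'INSERT'));  -- new rows
```

Because the MERGE here has an unconditional WHEN MATCHED branch, the four matching rows all count as updates, including the ones whose values did not change.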

...
OUTPUT
(SELECT -1 AS action -- DELETE
  , 1 AS execution_id
  , 0 AS row_version -- old
  , Deleted.*
  WHERE $action = 'DELETE'
  UNION ALL SELECT 0 AS action -- UPDATE
  , 1 AS execution_id
  , 0 AS row_version -- old
  , Deleted.*
  WHERE $action = 'UPDATE'
  UNION ALL SELECT 0 AS action -- UPDATE
  , 1 AS execution_id
  , 1 AS row_version -- new
  , Inserted.*
  WHERE $action = 'UPDATE'
  UNION ALL SELECT 1 AS action -- INSERT
  , 1 AS execution_id
  , 1 AS row_version -- new
  , Inserted.*
  WHERE $action = 'INSERT')

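So, once again: one CDC row for each deleted record, one CDC row for each inserted record, and two CDC rows for each updated record.

Is there a way to achieve this in a single step?

SQL Server 2012.

1 Answer:

Answer 0: (score: 0)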

  

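I know I could populate a #temp table and then "unpivot" the data from that table in a separate step, but I am looking for a one-step operation.

This is a very good question. The MERGE statement is a Swiss Army knife, so I tried this approach: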

INSERT INTO
SELECT ...
FROM (
     MERGE
     ...
     OUTPUT
     ...
     )

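But there is a problem: I cannot split one row into two, because I cannot use GROUP BY, CROSS JOIN, CROSS APPLY, a nested subquery, PIVOT, UNPIVOT, UNION ALL, or a CTE around the nested MERGE. Every attempt ended with one of these errors:
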
 A nested INSERT, UPDATE, DELETE, or MERGE statement 
 is not allowed as the table source of a PIVOT or UNPIVOT operator.

 A nested INSERT, UPDATE, DELETE, or MERGE statement is not allowed
 in a SELECT statement that is not the immediate source of rows 
 for an INSERT statement.

 A nested INSERT, UPDATE, DELETE, or MERGE statement is not allowed
 on either side of a JOIN or APPLY operator.

 The GROUP BY clause is not allowed when the FROM clause contains 
 a nested INSERT, UPDATE, DELETE, or MERGE statement.

The closest I could get is the following (for an UPDATE you get only one of the two rows, either the new or the old values):

;INSERT INTO #cdc(action, execution_id, rowversion,id, somedata, someotherdata)
SELECT 
    [action]
   ,[execution_id]
   ,[rowversion]    = CASE action WHEN -1 THEN 0 
                                  WHEN 1  THEN 1 
                                  WHEN 0  THEN 0 -- UPDATE keeps the *_old values below, so mark the row as old
                      END
   ,[id]            = CASE action WHEN -1 THEN id_old 
                                  WHEN 1  THEN id_new 
                                  WHEN 0  THEN id_old
                      END
   ,[somedata]      = CASE action WHEN -1 THEN somedata_old
                                  WHEN 1  THEN somedata_new
                                  WHEN 0  THEN somedata_old
                      END 
   ,[someotherdata] = CASE action WHEN -1 THEN someotherdata_old 
                                  WHEN 1  THEN someotherdata_new
                                  WHEN 0  THEN someotherdata_old
                      END
FROM (
    MERGE #t1 tgt
    USING (VALUES (1, 'aaa', 100),(2, 'bbb', 200),
                  (3, 'fff', 333), (4, 'ccc', 4444),
                  (50, 'xxx', 5050)) AS src(id, somedata, someotherdata)
    ON tgt.id = src.id
    WHEN MATCHED AND NOT EXISTS (SELECT src.somedata, src.someotherdata
                                 INTERSECT
                                 SELECT tgt.somedata, tgt.someotherdata)
    THEN
        UPDATE SET
               tgt.somedata = src.somedata
             , tgt.someotherdata = src.someotherdata
    WHEN NOT MATCHED BY TARGET THEN
        INSERT
        VALUES ( src.id
               , src.somedata
               , src.someotherdata
               )
    WHEN NOT MATCHED BY SOURCE THEN
        DELETE
    OUTPUT
    CASE $action
      WHEN 'DELETE' THEN -1
      WHEN 'UPDATE' THEN 0
      ELSE 1 -- 'INSERT'
    END AS action
  , 1 AS execution_id
  , Inserted.*
  , Deleted.*) 
AS m(action, execution_id, id_new, somedata_new, someotherdata_new
    ,id_old, somedata_old, someotherdata_old);

LiveDemo

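Also note that, in its current form, your MERGE will UPDATE the rows you marked as "same" as well. I added an extra condition that checks whether at least one column has actually changed.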

WHEN MATCHED AND NOT EXISTS (SELECT src.somedata, src.someotherdata
                             INTERSECT
                             SELECT tgt.somedata, tgt.someotherdata)
THEN
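
The reason for NOT EXISTS ... INTERSECT instead of a plain `<>` comparison is NULL handling: INTERSECT treats two NULLs as equal and a NULL and a value as different. A minimal sketch of the difference, using made-up variables @old and @new and assuming the default ANSI_NULLS ON setting:

```sql
-- Hypothetical values, chosen only to show the NULL handling.
DECLARE @old VARCHAR(10) = NULL
      , @new VARCHAR(10) = 'abc';

SELECT
    -- NULL <> 'abc' evaluates to UNKNOWN, so the change goes unnoticed here.
    CASE WHEN @old <> @new THEN 'changed' ELSE 'not detected' END AS plain_inequality
    -- INTERSECT returns no row when the two values differ, NULLs included.
  , CASE WHEN NOT EXISTS (SELECT @new INTERSECT SELECT @old)
         THEN 'changed' ELSE 'same' END AS intersect_check;
-- plain_inequality = 'not detected', intersect_check = 'changed'
```

With several columns, as in the MERGE condition, the row counts as changed as soon as any single column differs.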

Another approach is to add UPDATE/INSERT/DELETE triggers on the target table that insert the records into the cdc table, and to forget about the OUTPUT clause in MERGE.
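
A rough sketch of the trigger idea, not a tested production setup: dbo.t1_demo, dbo.t1_demo_cdc and dbo.trg_t1_demo_cdc are hypothetical names, execution_id is again hard-coded to 1, and GO is the batch separator of SSMS/sqlcmd. A MERGE fires the AFTER trigger separately for each action it performs, so the trigger can tell the action from which of the inserted/deleted pseudo-tables contain rows.

```sql
-- Sketch only: all object names below are hypothetical.
CREATE TABLE dbo.t1_demo (id INT, somedata VARCHAR(10), someotherdata INT);
CREATE TABLE dbo.t1_demo_cdc (action SMALLINT, execution_id BIGINT, row_version BIT,
                              id INT, somedata VARCHAR(10), someotherdata INT);
GO
CREATE TRIGGER dbo.trg_t1_demo_cdc ON dbo.t1_demo
AFTER INSERT, UPDATE, DELETE
AS
BEGIN
    SET NOCOUNT ON;

    -- Both pseudo-tables populated => UPDATE (0); only deleted => DELETE (-1);
    -- otherwise INSERT (1).
    DECLARE @action SMALLINT =
        CASE WHEN EXISTS (SELECT * FROM inserted)
              AND EXISTS (SELECT * FROM deleted) THEN 0
             WHEN EXISTS (SELECT * FROM deleted) THEN -1
             ELSE 1
        END;

    -- Old versions come from deleted (row_version 0), new ones from inserted (1).
    INSERT INTO dbo.t1_demo_cdc (action, execution_id, row_version, id, somedata, someotherdata)
    SELECT @action, 1, 0, d.id, d.somedata, d.someotherdata FROM deleted AS d
    UNION ALL
    SELECT @action, 1, 1, i.id, i.somedata, i.someotherdata FROM inserted AS i;
END;
GO
```

With this in place a plain MERGE (or any INSERT/UPDATE/DELETE) against dbo.t1_demo yields one CDC row per inserted or deleted record and two per updated record, at the cost of the logic living outside the statement itself.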