Flagging duplicate data with DUP in SQL Server using a multi-column comparison

Time: 2016-05-04 10:29:25

Tags: sql-server

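I have a table named DebitCardTransaction, and I want to flag duplicate rows by setting the Status column to DUPPAST, subject to this condition:

If another record exists with the same TPATransactionId, Channel, and EIN and a Status that is not NULL, then update the Status column of the NULL-status rows to DUPPAST.

In the table below, I need to update Status to DUPPAST for rows 2 and 3, because rows 2 and 3 contain exactly the same data and the first row's Status is not NULL.

Here are my table structure and data: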

+------------------+---------+--------+--------+
| TPATransactionId | Channel | EIN    | Status |
+------------------+---------+--------+--------+
| 55277801         | H       | 137580 | TBD    |
+------------------+---------+--------+--------+
| 55277801         | H       | 137580 | NULL   |
+------------------+---------+--------+--------+
| 55277801         | H       | 137580 | NULL   |
+------------------+---------+--------+--------+
| 55277801         | V       | 137580 | NULL   |
+------------------+---------+--------+--------+

Here is a script that builds the same structure:

DECLARE @DebitCardTransaction TABLE (TPATransactionId INT,Channel VARCHAR(50),EIN INT,Status VARCHAR(50));

INSERT @DebitCardTransaction VALUES (55277801,'H',137580,'TBD')
INSERT @DebitCardTransaction VALUES (55277801,'H',137580,NULL)
INSERT @DebitCardTransaction VALUES (55277801,'H',137580,NULL)
INSERT @DebitCardTransaction VALUES (55277801,'V',137580,NULL)

This is what I have tried so far:

UPDATE d1
SET d1.Status = 'DUPPAST'
FROM @DebitCardTransaction d1
INNER JOIN @DebitCardTransaction d2
  ON d1.TPATransactionId = d2.TPATransactionId
  AND d1.Channel = d2.Channel
  AND d1.EIN = d2.EIN
  AND d1.Status IS NULL

This is the output I expect:

+------------------+---------+--------+---------+
| TPATransactionId | Channel | EIN    | Status  |
+------------------+---------+--------+---------+
| 55277801         | H       | 137580 | TBD     |
+------------------+---------+--------+---------+
| 55277801         | H       | 137580 | DUPPAST |
+------------------+---------+--------+---------+
| 55277801         | H       | 137580 | DUPPAST |
+------------------+---------+--------+---------+
| 55277801         | V       | 137580 | NULL    |
+------------------+---------+--------+---------+

2 Answers:

Answer 0 (score: 0)

You can use ROW_NUMBER:

WITH cte AS
(
  SELECT *,
   rn = ROW_NUMBER() OVER(PARTITION BY TPATransactionId, Channel,EIN 
                          ORDER BY Status DESC)
  FROM @DebitCardTransaction
)
UPDATE cte
SET Status = 'DUPPAST'
WHERE Status IS NULL AND rn > 1;

SELECT *
FROM @DebitCardTransaction;

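To see why the query above picks the right rows, it can help to look at the row numbers on their own. The following is a minimal sketch that reuses the question's sample data. SQL Server sorts NULL as the lowest value, so ORDER BY Status DESC places rows with a non-NULL Status first within each group. One caveat worth checking against your own data: if every row in a group had a NULL Status, the second and later rows would still get rn > 1 and would be flagged, even though no non-NULL record exists for that group.

```sql
-- Sample data copied from the question
DECLARE @DebitCardTransaction TABLE (TPATransactionId INT, Channel VARCHAR(50), EIN INT, Status VARCHAR(50));

INSERT @DebitCardTransaction VALUES (55277801,'H',137580,'TBD');
INSERT @DebitCardTransaction VALUES (55277801,'H',137580,NULL);
INSERT @DebitCardTransaction VALUES (55277801,'H',137580,NULL);
INSERT @DebitCardTransaction VALUES (55277801,'V',137580,NULL);

-- Inspect the numbering before running any UPDATE.
-- In the 'H' group the 'TBD' row gets rn = 1 and the two NULL rows get rn = 2 and 3;
-- the single 'V' row gets rn = 1, so it is left untouched by "rn > 1".
SELECT TPATransactionId, Channel, EIN, Status,
       rn = ROW_NUMBER() OVER (PARTITION BY TPATransactionId, Channel, EIN
                               ORDER BY Status DESC)
FROM @DebitCardTransaction
ORDER BY Channel, rn;
```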

Note: you should add a column such as ID INT IDENTITY(1,1) to get a stable ordering and to be able to identify the first record:

rn = ROW_NUMBER() OVER(PARTITION BY TPATransactionId, Channel,EIN ORDER BY ID)

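Put together, the ID-based ordering could look like the sketch below. The ID column is an assumption added for illustration and is not part of the question's original table. It also assumes that the row you want to keep (the one with a non-NULL Status) was inserted first in its group, since ordering by ID simply treats the earliest row as the original.

```sql
-- Question's table variable plus a hypothetical ID column (not in the original table)
DECLARE @DebitCardTransaction TABLE (
    ID INT IDENTITY(1,1),   -- assumed surrogate key that gives a stable insertion order
    TPATransactionId INT,
    Channel VARCHAR(50),
    EIN INT,
    Status VARCHAR(50)
);

-- ID is generated automatically, so it is left out of the column list
INSERT @DebitCardTransaction (TPATransactionId, Channel, EIN, Status) VALUES (55277801,'H',137580,'TBD');
INSERT @DebitCardTransaction (TPATransactionId, Channel, EIN, Status) VALUES (55277801,'H',137580,NULL);
INSERT @DebitCardTransaction (TPATransactionId, Channel, EIN, Status) VALUES (55277801,'H',137580,NULL);
INSERT @DebitCardTransaction (TPATransactionId, Channel, EIN, Status) VALUES (55277801,'V',137580,NULL);

-- Number the rows of each (TPATransactionId, Channel, EIN) group in insertion order
WITH cte AS
(
  SELECT *,
   rn = ROW_NUMBER() OVER(PARTITION BY TPATransactionId, Channel, EIN
                          ORDER BY ID)
  FROM @DebitCardTransaction
)
UPDATE cte
SET Status = 'DUPPAST'
WHERE Status IS NULL AND rn > 1;   -- flag only later rows that have no status yet

SELECT *
FROM @DebitCardTransaction
ORDER BY ID;
```

With the four sample rows, this flags the second and third rows and leaves the 'V' row as NULL, which matches the output expected in the question.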

Edit

Is it possible to do this without a CTE?

Yes:

UPDATE s
SET Status = 'DUPPAST'
FROM (SELECT *, rn = ROW_NUMBER() OVER(PARTITION BY TPATransactionId, Channel,EIN 
                                       ORDER BY ID)
      FROM @DebitCardTransaction) s
WHERE Status IS NULL AND rn > 1;

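As an alternative that is not part of this answer, the condition from the question can also be written almost literally with EXISTS. This is only a sketch against the question's sample data. Compared with the self-join attempted in the question, the difference is the extra d2.Status IS NOT NULL test: without it, every NULL-status row matches itself and gets flagged, including the single 'V' row.

```sql
-- Sample data copied from the question
DECLARE @DebitCardTransaction TABLE (TPATransactionId INT, Channel VARCHAR(50), EIN INT, Status VARCHAR(50));

INSERT @DebitCardTransaction VALUES (55277801,'H',137580,'TBD');
INSERT @DebitCardTransaction VALUES (55277801,'H',137580,NULL);
INSERT @DebitCardTransaction VALUES (55277801,'H',137580,NULL);
INSERT @DebitCardTransaction VALUES (55277801,'V',137580,NULL);

-- Flag a NULL-status row only when some row with the same
-- TPATransactionId, Channel and EIN already has a non-NULL Status
UPDATE d1
SET d1.Status = 'DUPPAST'
FROM @DebitCardTransaction d1
WHERE d1.Status IS NULL
  AND EXISTS (SELECT 1
              FROM @DebitCardTransaction d2
              WHERE d2.TPATransactionId = d1.TPATransactionId
                AND d2.Channel = d1.Channel
                AND d2.EIN = d1.EIN
                AND d2.Status IS NOT NULL);

SELECT *
FROM @DebitCardTransaction;
```

Unlike the ROW_NUMBER versions, this form needs no ordering column, and it leaves a group alone when all of its rows have a NULL Status.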

Answer 1 (score: 0)

Try this:

Thread.sleep(5000);