Question

Because of a bug, some of my tables may now contain rows with duplicate values in the primary key columns.

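Suppose my table T has primary key columns A, B, C, and D, and non-PK columns E, F, and G. For a row to be unique, the combination of A, B, C, and D must be unique. I can have rows where A is the same, or where A and B have the same values, or even A, B, and C. But if I have two rows where A, B, C, and D all have the same values, that would be a problem.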

Is this the correct way to find such cases?

SELECT A, B, C, D, COUNT(*) AS 'Duplicates' FROM T
   GROUP BY A, B, C, D
   HAVING COUNT(*) > 1

Thanks for your help.

Answer 1

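The following query returns, for every combination of colA, colB, colC, and colD that occurs more than once, each row after the first. I actually use this code at work to remove duplicate entries from tables. (Switch the final SELECT to a DELETE to remove the duplicates while leaving one entry per key in the table.)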

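with a as
   (SELECT 
         colA
         ,colB
         ,colC
         ,colD
         -- SQL Server requires an ORDER BY inside OVER() for ROW_NUMBER();
         -- (SELECT NULL) numbers the rows of each group in arbitrary order
      ,ROW_NUMBER() OVER(PARTITION by colA
                                             ,colB
                                             ,colC
                                             ,colD
                               ORDER BY (SELECT NULL)
                               ) as duplicateRecCount
    FROM [Table]) -- placeholder name, bracketed because TABLE is a reserved word

 -- rows numbered 2 and up are the extra copies of a key
 Select * from a 
 where duplicateRecCount > 1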
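
To make the difference between the two queries concrete, here is a minimal sketch that runs both against a small in-memory table. It uses Python's built-in sqlite3 module as a stand-in for the real database (an assumption; the thread appears to target SQL Server), and the sample rows are made up. Window functions need SQLite 3.25 or newer.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for the real database here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T (A INT, B INT, C INT, D INT, E TEXT)")
conn.executemany(
    "INSERT INTO T VALUES (?, ?, ?, ?, ?)",
    [
        (1, 1, 1, 1, "first"),
        (1, 1, 1, 1, "second"),  # same key as the row above
        (1, 1, 1, 1, "third"),   # a third copy of the same key
        (1, 1, 1, 2, "unique"),  # differs in D, so not a duplicate
    ],
)

# GROUP BY / HAVING: one row per duplicated key, with how many rows share it.
duplicate_keys = conn.execute(
    """
    SELECT A, B, C, D, COUNT(*) AS Duplicates
    FROM T
    GROUP BY A, B, C, D
    HAVING COUNT(*) > 1
    """
).fetchall()

# ROW_NUMBER: every row after the first within each key group.
extra_rows = conn.execute(
    """
    WITH numbered AS (
        SELECT A, B, C, D, E,
               ROW_NUMBER() OVER (PARTITION BY A, B, C, D ORDER BY E) AS rn
        FROM T
    )
    SELECT A, B, C, D, E FROM numbered WHERE rn > 1 ORDER BY E
    """
).fetchall()

print(duplicate_keys)
print(extra_rows)
```

The GROUP BY form answers "which keys are duplicated, and how often", while the ROW_NUMBER form lists the individual surplus rows, which is the set you would delete.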

Answer 2

Are you using declarative referential integrity? If not, why not?

Something like this should work for you:

with duplicate_row as
(
  select distinct
         x.A ,
         x.B ,
         x.C ,
         x.D ,
         x.E ,
         x.F ,
         x.G
  from ( select * ,
                seq = row_number() over (
                        partition by A,B,C,D
                        order by E,F,G
                        )
         from dbo.my_table
       ) x
  where x.seq > 1
)
delete dbo.my_table
from dbo.my_table  t
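join duplicate_row d on d.A = t.A -- IMPORTANT:
                    and d.B = t.B -- you must join against ALL
                    and d.C = t.C -- columns, key and non-key
                    and d.D = t.D -- lest you blow away data
                    and d.E = t.E -- inadvertently
                    and d.F = t.F
                    and d.G = t.G
-- NOTE: rows that are identical in all seven columns are ALL deleted,
-- first copy included, and a NULL in any column never matches the join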
GO

alter table dbo.my_table add constraint
  my_table_PK primary key clustered (A,B,C,D)
GO
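
As a rough illustration of the clean-up-then-constrain sequence above, the sketch below removes the surplus rows and then enforces uniqueness on (A, B, C, D). It is not the answer's exact statement: it runs on SQLite through Python's sqlite3 module (an assumption, as is the sample data), deletes by SQLite's implicit rowid instead of joining on column values, and uses a unique index in place of the clustered primary key. Deleting by row identifier keeps one copy even when two rows match in every column.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for the real database here.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table (A INT, B INT, C INT, D INT, E TEXT, F TEXT, G TEXT)"
)
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (1, 1, 1, 1, "e1", "f1", "g1"),
        (1, 1, 1, 1, "e2", "f2", "g2"),  # duplicate key, different payload
        (2, 2, 2, 2, "e3", "f3", "g3"),
        (2, 2, 2, 2, "e3", "f3", "g3"),  # duplicate in every column
        (3, 3, 3, 3, "e4", "f4", "g4"),  # unique key
    ],
)

# Number the rows within each key, then delete every row numbered above 1.
# rowid is SQLite-specific; it identifies a physical row even when all
# column values are equal.
conn.execute(
    """
    DELETE FROM my_table
    WHERE rowid IN (
        SELECT rid FROM (
            SELECT rowid AS rid,
                   ROW_NUMBER() OVER (
                       PARTITION BY A, B, C, D
                       ORDER BY E, F, G
                   ) AS seq
            FROM my_table
        )
        WHERE seq > 1
    )
    """
)

# With the duplicates gone, the key can be enforced going forward.
conn.execute("CREATE UNIQUE INDEX my_table_PK ON my_table (A, B, C, D)")

remaining = conn.execute(
    "SELECT A, B, C, D, E FROM my_table ORDER BY A"
).fetchall()

# A new row that reuses an existing key should now be rejected.
try:
    conn.execute("INSERT INTO my_table VALUES (1, 1, 1, 1, 'x', 'y', 'z')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(remaining, rejected)
```

Which copy survives depends on the ORDER BY inside the window, so choose an ordering that reflects which row you actually want to keep.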

Clarification on identifying duplicate primary keys
