Identify duplicate records and insert them into another table

Time: 2018-09-11 11:36:30

Tags: sql sql-server

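I have a table like this that contains duplicate records. My requirement is to identify the duplicate records and store them in another table, Customer_duplicate,
and to collect the distinct records into one table.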

Existing query:

Create proc usp_store_duplicate_into_table 
as 
begin 
    insert into Customer_Duplicate 
    select * 
    from Customer C 
    group by cid 
    having count(cid) > 1                                   

3 answers:

Answer 0: (score: 0)

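What you have is close, but you can't select columns that are not in your GROUP BY (or wrapped in an aggregate), so select * fails here. For example, you could do this: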

insert into Customer_Duplicate 
select cid, count(*)
from Customer C 
group by cid 
having count(cid) > 1 

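Whether that works depends on what Customer_Duplicate looks like. If you really need to include every row that belongs to a duplicated cid, something like this may work for you: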

insert into Customer_Duplicate 
select *
from customer c
where c.cid in
(
    select cid
    from Customer
    group by cid 
    having count(cid) > 1
)
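
For reference, the query above could be placed back into the asker's procedure roughly as follows. This is only a sketch: it assumes Customer_Duplicate has the same column layout as Customer (the question does not show either table), and it adds the END that the original procedure was missing.

```sql
-- Sketch only: assumes Customer_Duplicate has the same columns, in the same
-- order, as Customer. Neither table definition appears in the question.
CREATE PROCEDURE usp_store_duplicate_into_table
AS
BEGIN
    SET NOCOUNT ON;

    -- Copy every row whose cid occurs more than once in Customer.
    INSERT INTO Customer_Duplicate
    SELECT c.*
    FROM Customer AS c
    WHERE c.cid IN
    (
        SELECT cid
        FROM Customer
        GROUP BY cid
        HAVING COUNT(*) > 1
    );
END;
```

Note that this copies all rows of a duplicated cid, including the first occurrence, which differs from the ROW_NUMBER() approaches in the other answers.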

Answer 1: (score: 0)

To find the duplicates, you can use the following code.

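insert into Customer_Duplicate 
SELECT c.name, c.othercolumns
FROM
    (select c.name, c.othercolumns,
            -- SQL Server rejects an integer index such as ORDER BY 1 inside OVER();
            -- ORDER BY (SELECT NULL) numbers the rows in arbitrary order
            ROW_NUMBER() OVER(PARTITION BY cid ORDER BY (SELECT NULL)) AS rnk
    from Customer C 
    ) AS c
WHERE c.rnk > 1;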

If you want to insert the distinct records into another table, you can use the code below.

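 insert into Customer_Distinct 
    SELECT c.name, c.othercolumns
    FROM
        (select c.name, c.othercolumns,
                -- ORDER BY (SELECT NULL) instead of ORDER BY 1, which is not allowed inside OVER()
                ROW_NUMBER() OVER(PARTITION BY cid ORDER BY (SELECT NULL)) AS rnk
        from Customer C 
        ) AS c
    WHERE c.rnk = 1;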

Answer 2: (score: 0)

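In SQL Server you can use the ranking function ROW_NUMBER() together with PARTITION BY to identify duplicate rows. The PARTITION BY clause lists the columns that define a duplicate: rows sharing the same values in those columns fall into one partition and are numbered 1, 2, 3, and so on. In this example I am using NAME and NO; replace them with your own column names.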

insert into Customer_Duplicate
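-- SELECT * also returns RNK, so Customer_Duplicate needs a matching extra column;
-- otherwise list the columns explicitly
SELECT * FROM (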
select * , ROW_NUMBER() OVER(PARTITION BY NAME,NO ORDER BY NAME,NO) AS RNK
from Customer C 
) AS d
WHERE rnk > 1
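
The ROW_NUMBER() approach from the answers can be tried end to end with temporary tables. The sketch below is illustrative only: the two-column schema (cid, name) and the sample rows are assumptions, since the question does not show the real Customer table, and #Ranked is a hypothetical helper table. It numbers the rows once, then sends row 1 of each cid to the distinct table and the remaining rows to the duplicate table.

```sql
-- Hypothetical schema and data for illustration; the real Customer columns
-- are not shown in the question.
CREATE TABLE #Customer           (cid INT, name VARCHAR(50));
CREATE TABLE #Customer_Duplicate (cid INT, name VARCHAR(50));
CREATE TABLE #Customer_Distinct  (cid INT, name VARCHAR(50));

INSERT INTO #Customer (cid, name)
VALUES (1, 'Ann'), (1, 'Ann'), (2, 'Bob'), (3, 'Cy'), (3, 'Cy'), (3, 'Cy');

-- Number the rows within each cid. ORDER BY (SELECT NULL) means the order
-- inside a partition is arbitrary, which is acceptable when the duplicate
-- rows are identical.
SELECT cid, name,
       ROW_NUMBER() OVER (PARTITION BY cid ORDER BY (SELECT NULL)) AS rnk
INTO #Ranked
FROM #Customer;

-- The first row of each cid is kept as the distinct record.
INSERT INTO #Customer_Distinct (cid, name)
SELECT cid, name FROM #Ranked WHERE rnk = 1;

-- Every further row of the same cid is a duplicate.
INSERT INTO #Customer_Duplicate (cid, name)
SELECT cid, name FROM #Ranked WHERE rnk > 1;

SELECT * FROM #Customer_Distinct;   -- 3 rows: one per cid
SELECT * FROM #Customer_Duplicate;  -- 3 rows: one for cid 1, two for cid 3
```

Listing the columns explicitly in each INSERT keeps the helper column rnk out of the target tables, so they can share the layout of the source table.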