Question

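I have a table like this that contains duplicate records. My requirement is to identify the duplicate records and store them in another table, Customer_Duplicate,
and to put the distinct records into a separate table.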

Existing query:

Create proc usp_store_duplicate_into_table 
as 
begin 
    insert into Customer_Duplicate 
    select * 
    from Customer C 
    group by cid 
    having count(cid) > 1

Answer 1

What you have is close, but you can't select columns that aren't in your GROUP BY. For example, you could do this:

insert into Customer_Duplicate 
select cid, count(*)
from Customer C 
group by cid 
having count(cid) > 1

It depends on what Customer_Duplicate looks like. If you really need to include all of the rows, something like this might work for you:

insert into Customer_Duplicate 
select *
from customer c
where c.cid in
(
    select cid
    from Customer
    group by cid 
    having count(cid) > 1
)
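
To make the difference between those two queries concrete, here is a small sketch you can run on its own. The table variable and its columns (cid, name) are made-up sample data, since the real Customer schema isn't shown in the question.

```sql
-- Hypothetical sample data; the real Customer columns are not known.
DECLARE @Customer TABLE (cid INT, name VARCHAR(50));

INSERT INTO @Customer (cid, name)
VALUES (1, 'Ann'), (1, 'Ann'), (2, 'Bob'), (3, 'Cy'), (3, 'Cy'), (3, 'Cy');

-- Query 1: one summary row per duplicated cid.
SELECT cid, COUNT(*) AS occurrences
FROM @Customer
GROUP BY cid
HAVING COUNT(*) > 1;
-- Returns two rows: (1, 2) and (3, 3); row order is not guaranteed.

-- Query 2: every row whose cid appears more than once.
SELECT c.cid, c.name
FROM @Customer AS c
WHERE c.cid IN (
    SELECT cid
    FROM @Customer
    GROUP BY cid
    HAVING COUNT(*) > 1
);
-- Returns five rows: two for cid 1 and three for cid 3.
```

Note that the second query keeps every copy, including the "original" one, so it fits only if Customer_Duplicate is meant to hold all rows of a duplicated cid.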

Answer 2

To find the duplicates, you can use the following code.

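insert into Customer_Duplicate 
SELECT c.name, c.othercolumns
FROM
    -- number the rows within each cid; SQL Server does not accept a constant such as ORDER BY 1 here
    (select name, othercolumns, ROW_NUMBER() OVER(PARTITION BY cid ORDER BY (SELECT NULL)) AS rnk
    from Customer
    ) AS c
WHERE c.rnk > 1;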

If you want to insert the distinct records into another table, you can use the code below.

 insert into Customer_Distinct 
    SELECT c.name, c.othercolumns
    FROM
        -- rnk = 1 keeps exactly one row per cid
        (select name, othercolumns, ROW_NUMBER() OVER(PARTITION BY cid ORDER BY (SELECT NULL)) AS rnk
        from Customer
        ) AS c
    WHERE c.rnk = 1;
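
Since the question wraps the insert in a stored procedure, the two statements could be combined roughly as follows. This is only a sketch: the procedure name is hypothetical, and it assumes Customer, Customer_Duplicate and Customer_Distinct all have exactly the columns (cid, name), which the question doesn't confirm.

```sql
-- Sketch only: assumes all three tables share the columns (cid, name).
CREATE PROCEDURE usp_split_customer_rows   -- hypothetical name
AS
BEGIN
    SET NOCOUNT ON;
    SET XACT_ABORT ON;   -- roll back the whole transaction if either insert fails

    BEGIN TRANSACTION;

    -- Second and later copies of each cid go to the duplicate table.
    INSERT INTO Customer_Duplicate (cid, name)
    SELECT d.cid, d.name
    FROM (
        SELECT cid, name,
               ROW_NUMBER() OVER (PARTITION BY cid ORDER BY (SELECT NULL)) AS rnk
        FROM Customer
    ) AS d
    WHERE d.rnk > 1;

    -- Exactly one copy of each cid goes to the distinct table.
    INSERT INTO Customer_Distinct (cid, name)
    SELECT d.cid, d.name
    FROM (
        SELECT cid, name,
               ROW_NUMBER() OVER (PARTITION BY cid ORDER BY (SELECT NULL)) AS rnk
        FROM Customer
    ) AS d
    WHERE d.rnk = 1;

    COMMIT TRANSACTION;
END;
```

With ORDER BY (SELECT NULL), which copy gets rnk = 1 is arbitrary. That is harmless when the copies are identical in every column; if they can differ, order by a column that picks the row you want to keep.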

Answer 3

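In SQL Server you can use the ROW_NUMBER() ranking function together with PARTITION BY to identify duplicate rows. In the PARTITION BY clause you list the columns that define a duplicate. For example, I'm using NAME and NO here; replace them with your own column names.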

insert into Customer_Duplicate
SELECT * FROM (
select * , ROW_NUMBER() OVER(PARTITION BY NAME,NO ORDER BY NAME,NO) AS RNK
from Customer C 
) AS d
WHERE rnk > 1
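
One thing to watch with the query above: SELECT * from the derived table also returns the RNK column, so the insert only works if Customer_Duplicate has a matching extra column. A variant with explicit column lists avoids that. The sketch below assumes both tables have just the placeholder columns NAME and NO from this answer, which is not a known schema.

```sql
-- Assumes Customer and Customer_Duplicate both have columns [NAME] and [NO];
-- these are the answer's placeholder names. Brackets guard against keyword clashes.
WITH ranked AS (
    SELECT [NAME], [NO],
           ROW_NUMBER() OVER (PARTITION BY [NAME], [NO]
                              ORDER BY [NAME], [NO]) AS RNK
    FROM Customer
)
INSERT INTO Customer_Duplicate ([NAME], [NO])
SELECT [NAME], [NO]      -- RNK is deliberately left out of the insert
FROM ranked
WHERE RNK > 1;
```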

Identify duplicate records and insert them into another table
