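I have a table like this that contains duplicate records. My requirement is to identify the duplicate records and store them in another table, Customer_duplicate,
and to collect the distinct records into a single table.
Existing query: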
Create proc usp_store_duplicate_into_table
as
begin
insert into Customer_Duplicate
select *
from Customer C
group by cid
having count(cid) > 1
end
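Answer 0 (score: 0)
What you have is mostly fine, but you can't select columns that are not in your GROUP BY unless they are wrapped in an aggregate. For example, you could do this: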
insert into Customer_Duplicate
select cid, count(*)
from Customer C
group by cid
having count(cid) > 1
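depending on what Customer_Duplicate looks like (for this insert it would need two columns, the cid and its count). If you really need to include all of the rows, something like this may work for you: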
insert into Customer_Duplicate
select *
from customer c
where c.cid in
(
select cid
from Customer
group by cid
having count(cid) > 1
)
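Putting the question's procedure together with the query above, a minimal sketch might look like the following. It assumes Customer_Duplicate has the same columns, in the same order, as Customer; that layout is not stated in the question.

```sql
-- Sketch only: assumes Customer_Duplicate mirrors the column layout of Customer.
CREATE PROCEDURE usp_store_duplicate_into_table
AS
BEGIN
    SET NOCOUNT ON;

    -- Copy every row whose cid occurs more than once in Customer.
    INSERT INTO Customer_Duplicate
    SELECT c.*
    FROM Customer AS c
    WHERE c.cid IN (
        SELECT cid
        FROM Customer
        GROUP BY cid
        HAVING COUNT(cid) > 1
    );
END;
```

Note that this copies all rows of a duplicated cid, including the first occurrence, so a cid that appears three times contributes three rows to Customer_Duplicate.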
Answer 1 (score: 0)
To find the duplicates, you can use the following code.
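insert into Customer_Duplicate
SELECT c.name, c.othercolumns
FROM
-- ORDER BY (SELECT NULL) is used because SQL Server rejects a column position such as ORDER BY 1 in a window function
(select c.name, c.othercolumns, ROW_NUMBER() OVER(PARTITION BY cid ORDER BY (SELECT NULL)) AS rnk
from Customer C
) AS c
WHERE c.rnk > 1;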
If you want to insert the distinct records into another table, you can use the code below.
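insert into Customer_Distinct
SELECT c.name, c.othercolumns
FROM
-- ORDER BY (SELECT NULL) is used because SQL Server rejects a column position such as ORDER BY 1 in a window function
(select c.name, c.othercolumns, ROW_NUMBER() OVER(PARTITION BY cid ORDER BY (SELECT NULL)) AS rnk
from Customer C
) AS c
WHERE c.rnk = 1;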
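To see how the row number splits the data, here is a small self-contained sketch that runs against a table variable instead of the real table. The sample rows and the two-column layout (cid, name) are made up for illustration.

```sql
-- Hypothetical sample data; the column layout is an assumption.
DECLARE @Customer TABLE (cid INT, name VARCHAR(50));

INSERT INTO @Customer (cid, name)
VALUES (1, 'Ann'), (1, 'Ann'), (2, 'Bob'), (3, 'Cy'), (3, 'Cy'), (3, 'Cy');

-- Number the rows within each cid, then label where each row would go.
WITH numbered AS (
    SELECT cid, name,
           ROW_NUMBER() OVER (PARTITION BY cid ORDER BY (SELECT NULL)) AS rnk
    FROM @Customer
)
SELECT cid, name, rnk,
       CASE WHEN rnk = 1 THEN 'Customer_Distinct'
            ELSE 'Customer_Duplicate'
       END AS target_table
FROM numbered;
```

With this sample, each of the three cid values gets exactly one row with rnk = 1, and the remaining three rows (one extra for cid 1, two extra for cid 3) get rnk > 1. Unlike the IN-based query in the earlier answer, the first occurrence of a duplicated cid is treated as the distinct row rather than as a duplicate.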
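Answer 2 (score: 0)
In SQL Server you can use the ranking function Row_Number() together with Partition By to identify duplicate rows.
In the Partition By clause you list the columns that together define a duplicate record.
For example, I am using NAME and NO here; replace them with your own column names.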
insert into Customer_Duplicate
SELECT * FROM (
select * , ROW_NUMBER() OVER(PARTITION BY NAME,NO ORDER BY NAME,NO) AS RNK
from Customer C
) AS d
WHERE rnk > 1
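One thing to watch with the query above: SELECT * on the derived table also returns the RNK column, so the insert only succeeds if Customer_Duplicate has a matching extra column. A hedged variant that lists the columns explicitly is sketched here; the column names cid, NAME and NO are assumptions and should be replaced with the real ones.

```sql
-- Column names are assumptions; the brackets simply guard against keyword clashes.
INSERT INTO Customer_Duplicate (cid, [NAME], [NO])
SELECT d.cid, d.[NAME], d.[NO]
FROM (
    SELECT c.cid, c.[NAME], c.[NO],
           -- Ordering by cid makes the choice of "first" row repeatable.
           ROW_NUMBER() OVER (PARTITION BY c.[NAME], c.[NO] ORDER BY c.cid) AS RNK
    FROM Customer AS c
) AS d
WHERE d.RNK > 1;
```

Ordering by the partition columns themselves, as in the original query, leaves the order within each group arbitrary, which is harmless when the rows are fully identical but matters if other columns differ.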