I have duplicate rows in my table. How can I delete them based on the value of a single column?
For example:
uniqueid, col2, col3 ...
1, john, simpson
2, sally, roberts
1, johnny, simpson
delete any duplicate uniqueIds
to get
1, John, Simpson
2, Sally, Roberts
Answer 0 (score: 35)
You can DELETE from a CTE:
-- [Table] is a placeholder name; the brackets are needed because TABLE is a reserved word.
WITH cte AS (SELECT *, ROW_NUMBER() OVER (PARTITION BY uniqueid ORDER BY col2) AS RowRank
             FROM [Table])
DELETE FROM cte
WHERE RowRank > 1;
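The ROW_NUMBER() function assigns a number to each row. PARTITION BY restarts the numbering for each group; in this case, the numbering restarts at 1 for each value of uniqueid. ORDER BY determines the order in which the numbers are assigned. Because each uniqueid is numbered starting from 1, any record with a ROW_NUMBER() greater than 1 has a duplicate uniqueid.
To see how the ROW_NUMBER() function works, try this: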
SELECT *, ROW_NUMBER() OVER (PARTITION BY uniqueid ORDER BY col2) AS RowRank
FROM [Table]
ORDER BY uniqueid;
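To try the numbering without a SQL Server instance, here is a minimal sketch that runs the same ROW_NUMBER() query on the question's sample rows. It assumes Python's bundled SQLite (3.25 or later, for window functions) as a stand-in for SQL Server; the table name `people` and the function `row_ranks` are hypothetical names chosen for this illustration.

```python
import sqlite3

# Assumption: SQLite (3.25+ for window functions) stands in for SQL Server.
# "people" and row_ranks() are hypothetical names used only for this sketch.
def row_ranks():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE people (uniqueid INTEGER, col2 TEXT, col3 TEXT)")
    conn.executemany(
        "INSERT INTO people VALUES (?, ?, ?)",
        [(1, "john", "simpson"), (2, "sally", "roberts"), (1, "johnny", "simpson")],
    )
    # Numbering restarts at 1 for every distinct uniqueid.
    rows = conn.execute(
        """
        SELECT uniqueid, col2,
               ROW_NUMBER() OVER (PARTITION BY uniqueid ORDER BY col2) AS RowRank
        FROM people
        ORDER BY uniqueid, RowRank
        """
    ).fetchall()
    conn.close()
    return rows

if __name__ == "__main__":
    for row in row_ranks():
        print(row)
```

Only the second row for uniqueid 1 receives a RowRank above 1, so it is the row that the DELETE would target.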
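You can adjust the logic of the ROW_NUMBER() function to control which records you keep or delete.
For example, you might want to do this in multiple steps, first deleting records that have the same last name but different first names. To do that, add the last name to the PARTITION BY: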
WITH cte AS (SELECT *, ROW_NUMBER() OVER (PARTITION BY uniqueid, col3 ORDER BY col2) AS RowRank
             FROM [Table])
DELETE FROM cte
WHERE RowRank > 1;
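The effect of changing the PARTITION BY columns can be sketched as follows. This is not the T-SQL above: SQLite cannot delete through a CTE, so the sketch ranks rows in a subquery and deletes them by SQLite's implicit rowid instead. The table name `people`, the function `dedupe`, and its `partition_by` parameter are hypothetical, and SQLite 3.25 or later is assumed.

```python
import sqlite3

# The question's sample rows.
SAMPLE = [(1, "john", "simpson"), (2, "sally", "roberts"), (1, "johnny", "simpson")]

def dedupe(partition_by="uniqueid"):
    """Delete every row ranked above 1 within its partition; return what is left.

    partition_by is pasted into the SQL text, so pass trusted column lists only.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE people (uniqueid INTEGER, col2 TEXT, col3 TEXT)")
    conn.executemany("INSERT INTO people VALUES (?, ?, ?)", SAMPLE)
    # SQLite stand-in for "DELETE FROM cte WHERE RowRank > 1":
    # rank in a subquery, then delete the matching implicit rowids.
    conn.execute(
        f"""
        DELETE FROM people
        WHERE rowid IN (
            SELECT rid FROM (
                SELECT rowid AS rid,
                       ROW_NUMBER() OVER (PARTITION BY {partition_by}
                                          ORDER BY col2) AS RowRank
                FROM people
            )
            WHERE RowRank > 1
        )
        """
    )
    rows = conn.execute("SELECT * FROM people ORDER BY uniqueid, col2").fetchall()
    conn.close()
    return rows

if __name__ == "__main__":
    print(dedupe("uniqueid"))        # one row per uniqueid
    print(dedupe("uniqueid, col3"))  # one row per (uniqueid, last name)
    print(dedupe("uniqueid, col2"))  # nothing deleted: first names differ
```

The more columns go into PARTITION BY, the narrower the definition of a duplicate becomes, so fewer rows are deleted.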
Answer 1 (score: 2)
You probably have a row ID that is assigned by the DB on insert and is actually unique. I'll call it rowId in my example.
rowId |uniqueid |col2   |col3
----- |-------- |------ |-------
1     |10       |john   |simpson
2     |20       |sally  |roberts
3     |10       |johnny |simpson
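You can remove duplicates by grouping on whatever is supposed to be unique (whether that is one column or several), taking one rowId from each group, and deleting everything except those rowIds. The inner query returns the rowId of every row in the table except the duplicates.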
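SELECT *   -- preview the rows that would be removed
--DELETE   -- once the preview looks right, swap this in for SELECT *
FROM MyTable
WHERE rowId NOT IN
    (SELECT MIN(rowId)
     FROM MyTable
     GROUP BY uniqueid);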
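You could also use MAX instead of MIN for a similar result: MIN keeps the row with the lowest rowId in each group, and MAX keeps the one with the highest.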
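The difference between MIN and MAX can be sketched on the three example rows above. This is a minimal illustration that assumes Python's bundled SQLite in place of SQL Server; the function `dedupe_by_rowid` and its `keep` parameter are hypothetical names, and rowId is declared explicitly as the primary key.

```python
import sqlite3

# The example rows from this answer: (rowId, uniqueid, col2, col3).
ROWS = [
    (1, 10, "john", "simpson"),
    (2, 20, "sally", "roberts"),
    (3, 10, "johnny", "simpson"),
]

def dedupe_by_rowid(keep="MIN"):
    """Keep one row per uniqueid: the lowest rowId (MIN) or the highest (MAX)."""
    if keep not in ("MIN", "MAX"):
        raise ValueError("keep must be 'MIN' or 'MAX'")
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE MyTable "
        "(rowId INTEGER PRIMARY KEY, uniqueid INTEGER, col2 TEXT, col3 TEXT)"
    )
    conn.executemany("INSERT INTO MyTable VALUES (?, ?, ?, ?)", ROWS)
    # keep is validated above, so it is safe to paste into the SQL text.
    conn.execute(
        f"""
        DELETE FROM MyTable
        WHERE rowId NOT IN
            (SELECT {keep}(rowId)
             FROM MyTable
             GROUP BY uniqueid)
        """
    )
    rows = conn.execute("SELECT * FROM MyTable ORDER BY rowId").fetchall()
    conn.close()
    return rows

if __name__ == "__main__":
    print(dedupe_by_rowid("MIN"))  # keeps rowId 1 and 2
    print(dedupe_by_rowid("MAX"))  # keeps rowId 2 and 3
```

If rowId grows with each insert, MIN keeps the oldest row for each uniqueid and MAX keeps the newest.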
Answer 2 (score: 1)
DELETE FROM [table] WHERE uniqueid='1' AND col2='john'
Or change col2='john' to col2='johnny', depending on which record you want to delete.
How did you end up with two identical "unique" IDs in the first place?
Answer 3 (score: 1)
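-- Table variable holding sample data with a duplicated id
DECLARE @du TABLE (
id INT,
Name VARCHAR(4)
)
INSERT INTO @du VALUES(1,'john')
INSERT INTO @du VALUES(2,'jane')
INSERT INTO @du VALUES(1,'john')
-- Number the rows within each id; dp > 1 marks a duplicate
;WITH dup (id,dp)
AS
(SELECT id
, ROW_NUMBER() OVER(PARTITION BY id ORDER BY Name) AS dp
FROM @du)
-- Deleting from the CTE deletes the underlying rows in @du
DELETE FROM dup
WHERE dp > 1
-- Show what is left
SELECT *
FROM @du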
Answer 4 (score: 1)
Here is a simple way to remove duplicates:
select * into NewTable from ExistingTable
union
select * from ExistingTable;
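One thing worth checking with this approach is what UNION counts as a duplicate: it compares whole rows, not a single column. The sketch below is a hedged illustration that uses Python's bundled SQLite, where `CREATE TABLE ... AS SELECT` plays the role of `SELECT ... INTO`; the function `union_copy` and the three-column layout are assumptions made for the example.

```python
import sqlite3

def union_copy(rows):
    """Copy ExistingTable into NewTable through UNION and return NewTable's rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE ExistingTable (uniqueid INTEGER, col2 TEXT, col3 TEXT)")
    conn.executemany("INSERT INTO ExistingTable VALUES (?, ?, ?)", rows)
    # SQLite equivalent of: select * into NewTable from ... union select * from ...
    conn.execute(
        """
        CREATE TABLE NewTable AS
        SELECT * FROM ExistingTable
        UNION
        SELECT * FROM ExistingTable
        """
    )
    result = conn.execute("SELECT * FROM NewTable ORDER BY uniqueid, col2").fetchall()
    conn.close()
    return result

if __name__ == "__main__":
    # Rows identical in every column collapse into one.
    print(union_copy([(1, "john", "simpson"), (2, "sally", "roberts"), (1, "john", "simpson")]))
    # The question's rows differ in col2 (john vs johnny), so both survive.
    print(union_copy([(1, "john", "simpson"), (2, "sally", "roberts"), (1, "johnny", "simpson")]))
```

For the question's data, where the duplicated uniqueid has different first names, the UNION copy keeps both rows, so a per-column method such as ROW_NUMBER() or MIN(rowId) would still be needed.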
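Answer 5 (score: 1)
You can delete duplicate records in several ways; some of them are listed below.
Different ways to delete duplicate records
Using the Row_Number() function and a CTE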
WITH CTE (DuplicateCount) AS (
    SELECT ROW_NUMBER() OVER (PARTITION BY UniqueId ORDER BY UniqueId) AS DuplicateCount
    FROM Table1
)
DELETE FROM CTE
WHERE DuplicateCount > 1
Without using a CTE
DELETE DuplicateCount
FROM (
    SELECT ROW_NUMBER() OVER (PARTITION BY UniqueId ORDER BY UniqueId) AS Dup
    FROM Table1
) DuplicateCount
WHERE DuplicateCount.Dup > 1
Without using Row_Number() or a CTE
DELETE FROM Subject
WHERE RowId NOT IN (SELECT MIN(RowId)
                    FROM Subject
                    GROUP BY UniqueId)