Take the following T-SQL query:
DECLARE @table TABLE(data VARCHAR(20))
INSERT INTO @table VALUES ('not duplicate row')
INSERT INTO @table VALUES ('duplicate row')
INSERT INTO @table VALUES ('duplicate row')
INSERT INTO @table VALUES ('second duplicate row')
INSERT INTO @table VALUES ('second duplicate row')
SELECT data
INTO #duplicates
FROM @table
GROUP BY data
HAVING COUNT(*) > 1
-- delete all rows that are duplicated
DELETE FROM @table
FROM @table o INNER JOIN #duplicates d
ON d.data = o.data
-- insert one row for every duplicate set
INSERT INTO @table(data)
SELECT data
FROM #duplicates
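I understand what it is doing, but the last part of the logic (insert one row for every duplicate set) doesn't make sense to me. If the "delete all rows that are duplicated" step already gets rid of the duplicates, what is the last part for?
I found this query here.
Thanks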
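Answer 0 (score: 5)
"If the 'delete all rows that are duplicated' code already gets rid of the duplicates, then what is the last part for?"
First, it deletes every row that has ever had a duplicate. That is, all copies, the originals too. In the case above, only one row ('not duplicate row') remains in the table after the DELETE. The other four rows are deleted.
Then it repopulates the table from #duplicates, which holds one row per duplicated value, so the deleted values come back without their duplicates.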
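The three steps can be traced by counting rows after each one. The following is a minimal sketch that reuses the sample data from the question; the COUNT(*) checks, the column aliases, and the `DELETE o` alias form are additions for illustration and are not part of the original script.

```sql
DECLARE @table TABLE(data VARCHAR(20));
INSERT INTO @table VALUES ('not duplicate row');
INSERT INTO @table VALUES ('duplicate row');
INSERT INTO @table VALUES ('duplicate row');
INSERT INTO @table VALUES ('second duplicate row');
INSERT INTO @table VALUES ('second duplicate row');

-- Step 1: collect one row per value that occurs more than once
SELECT data
INTO #duplicates
FROM @table
GROUP BY data
HAVING COUNT(*) > 1;

SELECT COUNT(*) AS duplicate_sets FROM #duplicates;   -- 2

-- Step 2: remove every copy of those values, originals included
DELETE o
FROM @table o
INNER JOIN #duplicates d ON d.data = o.data;

SELECT COUNT(*) AS rows_after_delete FROM @table;     -- 1

-- Step 3: put exactly one copy of each value back
INSERT INTO @table(data)
SELECT data FROM #duplicates;

SELECT COUNT(*) AS rows_after_insert FROM @table;     -- 3

DROP TABLE #duplicates;
```

Without step 3 the table would end up with only 'not duplicate row', so the final INSERT is what keeps one copy of each duplicated value.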
This is not the best way to delete duplicates.
The best way is:
WITH q AS (
SELECT data, ROW_NUMBER() OVER (PARTITION BY data ORDER BY data) AS rn
FROM @table
)
DELETE
FROM q
WHERE rn > 1
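To see why `rn > 1` picks out only the extra copies, the SELECT inside the CTE can be run on its own. This is a sketch against the question's sample data; the final ORDER BY is added here only to make the output order predictable.

```sql
DECLARE @table TABLE(data VARCHAR(20));
INSERT INTO @table VALUES ('not duplicate row');
INSERT INTO @table VALUES ('duplicate row');
INSERT INTO @table VALUES ('duplicate row');
INSERT INTO @table VALUES ('second duplicate row');
INSERT INTO @table VALUES ('second duplicate row');

-- Numbering restarts at 1 for each distinct value of data
SELECT data,
       ROW_NUMBER() OVER (PARTITION BY data ORDER BY data) AS rn
FROM @table
ORDER BY data, rn;
-- data                  rn
-- duplicate row         1
-- duplicate row         2
-- not duplicate row     1
-- second duplicate row  1
-- second duplicate row  2
```

Every value keeps exactly one row with rn = 1, so deleting the rows with rn > 1 removes two rows here and leaves three. Which physical copy receives rn = 1 is arbitrary because the ORDER BY ties within a partition, but that does not matter when the copies are identical.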
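Answer 1 (score: 0)
The DELETE call removes all matching records.
Because every duplicated row is deleted, originals included, the last part re-inserts one row for each of them.
Answer 2 (score: 0)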
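-- Sample table; each (Test1, Test2) pair is inserted twice below
Create table Test (Test1 int not null , Test2 varchar(10) null )
Insert Into Test
Select 12, 'abc'
Union All
Select 13 , 'def'
Insert Into Test
Select 12, 'abc'
Union All
Select 13 , 'def'
-- Four rows at this point; the statement before WITH must end with a semicolon
Select * From Test;
-- Number the rows within each (Test1, Test2) group and delete all but the first
WITH t1 AS
(SELECT ROW_NUMBER ( ) OVER ( PARTITION BY test1, test2 ORDER BY test1)
AS RNUM FROM Test )
DELETE FROM t1 WHERE RNUM > 1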
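One way to check the result is to reuse the GROUP BY / HAVING pattern from the question. This is a sketch that assumes the Test table created above still exists; the `copies` alias is illustrative.

```sql
-- Returns no rows once every (Test1, Test2) pair is unique
SELECT Test1, Test2, COUNT(*) AS copies
FROM Test
GROUP BY Test1, Test2
HAVING COUNT(*) > 1;
```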