Delete rows with duplicate values

Time: 2013-04-04 14:41:35

Tags: sql sql-server

I have to clean up a table that contains duplicate rows:

id: serial id
gid: group id
url: string <- this is the column that I have to clean up

A single gid may have several url values:

id    gid   url
----  ----  ------------
1     12    www.gmail.com
2     12    www.some.com
3     12    www.some.com <-- duplicate
4     13    www.other.com
5     13    www.milfsome.com <-- not a duplicate

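I want to run one query against the whole table and delete every row whose gid + url combination is a duplicate. In the example above, only rows 1, 2, 4 and 5 should remain after the delete.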
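For anyone who wants to try the answers below, here is a minimal script that recreates the sample data. The table name dbo.GroupUrls and the column types are assumptions made for illustration; the question does not state the real ones.

```sql
-- Hypothetical table name and column types (not given in the question).
CREATE TABLE dbo.GroupUrls
(
    id  INT          NOT NULL PRIMARY KEY,
    gid INT          NOT NULL,
    url VARCHAR(255) NOT NULL
);

-- The five sample rows from the question.
INSERT INTO dbo.GroupUrls (id, gid, url)
VALUES (1, 12, 'www.gmail.com'),
       (2, 12, 'www.some.com'),
       (3, 12, 'www.some.com'),      -- duplicate of id 2
       (4, 13, 'www.other.com'),
       (5, 13, 'www.milfsome.com');  -- not a duplicate
```

The ids are inserted explicitly so that they match the example; a real table with an auto-generated id would leave that column out of the INSERT.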

2 answers:

Answer 0: (score: 13)

;WITH x AS 
(
   -- rn numbers the rows within each (gid, url) pair, lowest id first
   SELECT id, gid, url, rn = ROW_NUMBER() OVER
     (PARTITION BY gid, url ORDER BY id) 
   FROM dbo.[table] -- placeholder name; bracketed because TABLE is a reserved word
)
SELECT id,gid,url FROM x WHERE rn = 1 -- the rows you'll keep
-- SELECT id,gid,url FROM x WHERE rn > 1 -- the rows you'll delete
-- DELETE x WHERE rn > 1; -- do the delete

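The first SELECT shows the rows you will keep. Once you are happy with it, remove it and uncomment the second SELECT, which shows the rows you will delete. Once you are happy with that one too, remove it and uncomment the DELETE.

If you don't want to delete any data, just ignore the commented-out lines under the SELECT...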
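As an extra safety net, the delete can be run inside an explicit transaction, so the result can be checked before it becomes permanent. This is only a sketch under the same assumptions as above (dbo.[table] is a placeholder name), not a required part of the approach.

```sql
BEGIN TRANSACTION;

WITH x AS
(
   SELECT id, rn = ROW_NUMBER() OVER
     (PARTITION BY gid, url ORDER BY id)
   FROM dbo.[table]  -- placeholder table name
)
DELETE x WHERE rn > 1;

-- Number of rows removed by the DELETE directly above.
SELECT @@ROWCOUNT AS rows_deleted;

-- Inspect what is left before making the change permanent.
SELECT id, gid, url FROM dbo.[table] ORDER BY id;

-- Run exactly one of these once the result has been checked:
-- COMMIT TRANSACTION;
-- ROLLBACK TRANSACTION;
```

Keep in mind that the open transaction holds locks on the affected rows until it is committed or rolled back, so it should not be left open for long on a busy table.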

Answer 1: (score: 1)

SELECT 
MIN(id) AS id,
gid,
url
FROM yourTable
GROUP BY gid, url
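The query above only lists the rows to keep; it does not delete anything. One possible way to turn the same grouping into a delete is sketched below, assuming id is unique and never NULL (which the "serial id" description suggests) and using the same placeholder name yourTable.

```sql
-- Remove every row whose id is not the lowest id of its (gid, url) group.
DELETE FROM yourTable
WHERE id NOT IN
(
    SELECT MIN(id)
    FROM yourTable
    GROUP BY gid, url
);
```

With the sample data from the question this would remove only the row with id 3, since it shares gid 12 and url www.some.com with id 2.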