Delete duplicate records from a table and keep the latest record

Time: 2012-11-16 15:27:19

Tags: sql sql-server-2008

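I want to delete any duplicate records from a table and keep the latest record (by date). In the example below, the first record (hdate = 2012-07-01, id = 16) would be deleted.

Using SQL Server 2008.

Thanks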

hdate      id           secId       pricesource          price         
---------- ------------ ----------- -------------------- --------------
2012-07-01 16           126         DFLT                 NULL          
2012-07-02 16           126         DFLT                 NULL          
2012-07-01 CAD          20          DFLT                 1             
2012-07-01 TWD          99          DFLT                 1   

3 answers:

Answer 0: (score: 2)

With SQL Server 2005 or later, you can use ROW_NUMBER() with OVER in a CTE and delete through it:

WITH CTE AS
(
  SELECT hdate, id, secId, pricesource, price,
  ROW_NUMBER() OVER (PARTITION BY id, secId, pricesource, price ORDER BY hdate DESC) AS RN
  FROM dbo.TableName t
)
DELETE FROM CTE WHERE RN > 1
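
Before running the DELETE, it can help to preview which rows it would remove. A minimal sketch, assuming the same dbo.TableName and columns as in the statement above: the CTE is unchanged and only the final statement is swapped for a SELECT.

```sql
-- Preview only: lists the rows the DELETE would remove (RN > 1),
-- i.e. every row except the one with the newest hdate in each duplicate group.
WITH CTE AS
(
  SELECT hdate, id, secId, pricesource, price,
         ROW_NUMBER() OVER (PARTITION BY id, secId, pricesource, price
                            ORDER BY hdate DESC) AS RN
  FROM dbo.TableName
)
SELECT hdate, id, secId, pricesource, price
FROM CTE
WHERE RN > 1;
```

With the sample data from the question, this should return only the 2012-07-01 row for id = 16, because PARTITION BY puts the two NULL prices in the same group.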


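Answer 1: (score: 0)

If your RDBMS doesn't support CTEs, or doesn't allow deleting from them (you didn't list what you're using), here is a version that should work for everything else: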

DELETE FROM TableName as a
WHERE EXISTS (SELECT '1'
              FROM TableName b
              WHERE b.id = a.id  -- Plus all other 'duplicate' columns
                    AND b.hdate > a.hdate);

(This was tried in a modified version of Tim's Fiddle demo, although for some reason it doesn't work on SQL Server.)
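
The statement above most likely fails on SQL Server because T-SQL does not accept an alias directly after the target table in DELETE FROM TableName AS a. A minimal sketch of the form SQL Server does accept, where the alias is named right after DELETE and declared in the FROM clause (same table and column names as above; the remaining duplicate columns are still left as a placeholder comment):

```sql
-- T-SQL form: DELETE <alias> FROM <table> AS <alias> ...
DELETE a
FROM TableName AS a
WHERE EXISTS (SELECT 1
              FROM TableName AS b
              WHERE b.id = a.id  -- Plus all other 'duplicate' columns
                    AND b.hdate > a.hdate);
```

Note that any nullable column compared with a plain = will never match when both sides are NULL, so such columns need explicit IS NULL handling.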

Answer 2: (score: 0)

This isn't as elegant as Tim's solution, but it doesn't require a CTE. It also treats NULL values in the columns as equivalent.

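-- On SQL Server the alias must be named after DELETE;
-- DELETE FROM MyTable m1 is a syntax error there.
DELETE m1
FROM MyTable m1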
WHERE EXISTS (
    SELECT 1
    FROM MyTable m2
    WHERE 
        (m2.id = m1.id OR (m2.id IS NULL AND m1.id IS NULL))
    AND (m2.secId = m1.secId OR (m2.secId IS NULL AND m1.secId IS NULL))
    AND (m2.pricesource = m1.pricesource OR (m2.pricesource IS NULL AND m1.pricesource IS NULL))
    AND (m2.price = m1.price  OR (m2.price IS NULL AND m1.price IS NULL))
    AND m2.hdate > m1.hdate
);
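
To try the statements in these answers, a throwaway table holding the question's sample rows can be set up roughly as follows. The table name dbo.TableName and all column types are assumptions, since the question does not give them; rename the table to MyTable to match this answer.

```sql
-- Hypothetical schema: the question shows the columns but not their types.
CREATE TABLE dbo.TableName
(
  hdate       date           NOT NULL,
  id          varchar(12)    NOT NULL,
  secId       int            NOT NULL,
  pricesource varchar(20)    NOT NULL,
  price       decimal(18, 4) NULL
);

-- The four sample rows from the question.
INSERT INTO dbo.TableName (hdate, id, secId, pricesource, price)
VALUES ('2012-07-01', '16',  126, 'DFLT', NULL),
       ('2012-07-02', '16',  126, 'DFLT', NULL),
       ('2012-07-01', 'CAD', 20,  'DFLT', 1),
       ('2012-07-01', 'TWD', 99,  'DFLT', 1);
```

After a successful DELETE, SELECT * FROM dbo.TableName should show three rows, with the 2012-07-01 row for id = 16 gone.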