SQL Server Duplicate Records: delete the oldest records and keep the latest

Asked: 2012-05-21 10:20:36

Tags: sql sql-server date duplicate-removal

I have a table in SQL Server similar to this:

Emp#        CourseID        DateComplete        Status
1           Course1         21/05/2012          Failed
1           Course1         22/05/2012          Passed
2           Course2         22/05/2012          Passed
3           Course3         22/05/2012          Passed
4           Course1         31/01/2012          Failed
4           Course1         28/02/2012          Passed
4           Course2         28/02/2012          Passed

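I am trying to keep only the latest record per course for each Emp#. If the same course was attempted more than once on the same day, the 'Passed' record should be the one kept.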

I was thinking along these lines:

SELECT DISTINCT .....
        INTO Dup_Table
        FROM MainTable
GROUP BY ........
HAVING COUNT(*) > 1

DELETE MainTable
        WHERE Emp# IN (SELECT Emp# FROM Dup_Table)

INSERT MainTable SELECT * FROM Dup_Table

Drop Table Dup_Table
GO

But I am not sure:

  1. whether this is the best approach, and
  2. how to bring Emp#/CourseID/DateComplete/Status together.
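
As an illustration of the duplicate check that the outline above hints at with GROUP BY and HAVING COUNT(*) > 1, a minimal sketch follows. The temp table #MainTable, its column types, and the AttemptCount alias are assumptions made for the example; they are not part of the original post.

```sql
-- Hypothetical scratch table mirroring the sample data; names and types are assumptions.
CREATE TABLE #MainTable (
    Emp#         int         NOT NULL,
    CourseID     varchar(20) NOT NULL,
    DateComplete date        NOT NULL,
    Status       varchar(10) NOT NULL
);

INSERT INTO #MainTable (Emp#, CourseID, DateComplete, Status)
VALUES (1, 'Course1', '20120521', 'Failed'),
       (1, 'Course1', '20120522', 'Passed'),
       (2, 'Course2', '20120522', 'Passed'),
       (3, 'Course3', '20120522', 'Passed'),
       (4, 'Course1', '20120131', 'Failed'),
       (4, 'Course1', '20120228', 'Passed'),
       (4, 'Course2', '20120228', 'Passed');

-- List each employee/course pair that has more than one attempt.
SELECT Emp#, CourseID, COUNT(*) AS AttemptCount
FROM   #MainTable
GROUP BY Emp#, CourseID
HAVING COUNT(*) > 1;

DROP TABLE #MainTable;
```

With the sample rows, this lists the pairs (1, Course1) and (4, Course1), each with two attempts. It only identifies the duplicated pairs; it does not decide which row of each pair to keep.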

2 Answers:

Answer 0: (score: 5)

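-- Number the rows of each employee/course group, newest first. On the same
-- date, 'Passed' sorts ahead of 'Failed' because of Status DESC.
;WITH cte
     AS (SELECT ROW_NUMBER() OVER (PARTITION BY Emp#, CourseID
                                   ORDER BY DateComplete DESC, Status DESC) AS RN
         FROM   MainTable)
-- Deleting through the CTE removes the underlying MainTable rows.
DELETE FROM cte
WHERE  RN > 1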
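
Because the DELETE is destructive, it may be worth trying it inside a transaction first. The sketch below is one way to do that, not part of the original answer. It assumes a scratch temp table #MainTable (hypothetical) and that DateComplete is stored as a date type rather than as dd/mm/yyyy text, since text would not sort chronologically.

```sql
-- Hypothetical scratch copy of the question's sample data; names and types are assumptions.
CREATE TABLE #MainTable (
    Emp#         int         NOT NULL,
    CourseID     varchar(20) NOT NULL,
    DateComplete date        NOT NULL,
    Status       varchar(10) NOT NULL
);

INSERT INTO #MainTable (Emp#, CourseID, DateComplete, Status)
VALUES (1, 'Course1', '20120521', 'Failed'),
       (1, 'Course1', '20120522', 'Passed'),
       (2, 'Course2', '20120522', 'Passed'),
       (3, 'Course3', '20120522', 'Passed'),
       (4, 'Course1', '20120131', 'Failed'),
       (4, 'Course1', '20120228', 'Passed'),
       (4, 'Course2', '20120228', 'Passed');

BEGIN TRANSACTION;

-- Same ranking as in the answer: newest first, 'Passed' ahead of 'Failed' on ties.
WITH cte AS (
    SELECT ROW_NUMBER() OVER (PARTITION BY Emp#, CourseID
                              ORDER BY DateComplete DESC, Status DESC) AS RN
    FROM   #MainTable
)
DELETE FROM cte
WHERE  RN > 1;

-- Inspect what survived before deciding whether to keep the change.
SELECT Emp#, CourseID, DateComplete, Status
FROM   #MainTable
ORDER BY Emp#, CourseID;

-- Undo the delete; use COMMIT TRANSACTION instead once the result looks right.
ROLLBACK TRANSACTION;

DROP TABLE #MainTable;
```

With the sample rows, the two 'Failed' attempts (Emp# 1 and Emp# 4 on Course1) are removed and the other five rows remain, until the ROLLBACK restores them.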

Answer 1: (score: 0)

You can use ROW_NUMBER() with PARTITION BY and ORDER BY to get the latest record:

Select *
From  (
    Select *,
           Row_Number() Over (Partition By Emp#, CourseID Order By DateComplete DESC, Case When Status = 'Passed' Then 1 Else 2 End  ) AS RecordNumber
    From #Emp)Z
Where Z.RecordNumber = 1
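
The CASE expression in the query above states the tie-break explicitly, which matters if statuses other than 'Passed' and 'Failed' can occur. The sketch below tries to illustrate this. The rows for Emp# 5 and the 'Withdrawn' status are invented for the example, and the #Emp column types are assumptions.

```sql
-- Hypothetical data: Emp# 5 and the 'Withdrawn' status are not in the original post.
CREATE TABLE #Emp (
    Emp#         int         NOT NULL,
    CourseID     varchar(20) NOT NULL,
    DateComplete date        NOT NULL,
    Status       varchar(10) NOT NULL
);

INSERT INTO #Emp (Emp#, CourseID, DateComplete, Status)
VALUES (1, 'Course1', '20120521', 'Failed'),
       (1, 'Course1', '20120522', 'Passed'),
       (5, 'Course1', '20120522', 'Withdrawn'),  -- same-day attempts
       (5, 'Course1', '20120522', 'Passed');

-- Newest date first; within the same date, 'Passed' ranks ahead of every other status.
SELECT Z.Emp#, Z.CourseID, Z.DateComplete, Z.Status
FROM  (
    SELECT Emp#, CourseID, DateComplete, Status,
           ROW_NUMBER() OVER (PARTITION BY Emp#, CourseID
                              ORDER BY DateComplete DESC,
                                       CASE WHEN Status = 'Passed' THEN 1 ELSE 2 END) AS RecordNumber
    FROM   #Emp
) AS Z
WHERE Z.RecordNumber = 1;

DROP TABLE #Emp;
```

Here both employees come back with their 'Passed' row. Ordering by Status DESC instead would pick 'Withdrawn' for Emp# 5, because 'Withdrawn' sorts after 'Passed' alphabetically, so the CASE form is the safer choice when the set of status values is not limited to the two shown in the question.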