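SQL query to identify duplicates and then delete all but the most recent record

Time: 2014-09-02 21:25:45

Tags: sql duplicates duplicate-removal

I have the following query, which identifies duplicate records based on the employee_id field.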

SELECT ROW_NUMBER() OVER(PARTITION BY c1.employee_id ORDER BY c1.lastlogon ASC ) AS Row
    ,[DN]
    ,[first_name]
    ,[last_name]
    ,[init]
    ,[email]
    ,[title]
    ,[display_name]
    ,[department]
    ,[phone_num]
    ,[mob_num]
    ,[fax_num]
    ,[pager_num]
    ,[logon]
    ,[post_code]
    ,[www]
    ,[objectSID]
    ,[disabled]
    ,[lastlogon]
    ,[employee_id]
    ,[acc_type]
FROM AD_Users_All_Staging c1
WHERE EXISTS
(
     SELECT 1
     FROM AD_Users_All_Staging c2
     WHERE c2.employee_id = c1.employee_id
     GROUP BY 
         employee_id
     HAVING COUNT(1) > 1  -- more than one value
)

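How can I select only the most recent record (by the value in the lastlogon field) for each employee_id that has duplicates?

A follow-up question: how do I delete every record in each set of duplicates except the most recent one?

Many thanks

3 Answers:

Answer 0: (score: 0)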

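Since I don't have your data, I can't easily test this... However, if you change the window function to order by c1.lastlogon DESC instead of ASC, the most recent record in each employee_id group will always be Row = 1. You can then keep that first record and delete the rest (Row > 1).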
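A minimal sketch of that idea, assuming SQL Server (the bracketed identifiers and `;WITH` in the question suggest it, but the question does not say so). The CTE name `ranked` and the column alias `rn` are made up for illustration. The `employee_id IS NOT NULL` filter is also an assumption: `PARTITION BY` puts all NULL employee_id rows in one group, so without the filter they would be deduplicated as well.

```sql
-- Rank rows within each employee_id, newest lastlogon first.
;WITH ranked AS (
    SELECT ROW_NUMBER() OVER (PARTITION BY employee_id
                              ORDER BY lastlogon DESC) AS rn
    FROM AD_Users_All_Staging
    WHERE employee_id IS NOT NULL   -- assumption: leave NULL ids alone
)
-- Deleting through the CTE removes the underlying table rows.
DELETE FROM ranked
WHERE rn > 1;                       -- keep only the newest row per employee_id
```

No separate duplicate check is needed here, because an employee_id with a single row only ever gets rn = 1. If two rows share the same latest lastlogon, ROW_NUMBER() picks one of them arbitrarily, so it may be worth running the CTE with a SELECT first to inspect what would be removed.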

Answer 1: (score: 0)

You can select the most recent record for each employee_id like this:

select uas.*
from AD_Users_All_Staging uas
where not exists (select 1
                  from AD_Users_All_Staging uas2
                  where uas2.employee_id = uas.employee_id and
                        uas2.lastlogon > uas.lastlogon
                 );

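You can use the inverse logic to find the rows to delete, that is, every row for which a newer row with the same employee_id exists: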

select uas.*
from AD_Users_All_Staging uas
where exists (select 1
              from AD_Users_All_Staging uas2
              where uas2.employee_id = uas.employee_id and
                    uas2.lastlogon > uas.lastlogon
             );
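Turned into an actual delete, the query above might look like the following sketch. It assumes SQL Server, where `DELETE alias FROM table alias` is accepted; other databases may need a different form.

```sql
-- Delete every row that has a newer row with the same employee_id.
DELETE uas
FROM AD_Users_All_Staging uas
WHERE EXISTS (SELECT 1
              FROM AD_Users_All_Staging uas2
              WHERE uas2.employee_id = uas.employee_id
                AND uas2.lastlogon > uas.lastlogon);
```

Two caveats follow from the `>` comparison: rows that tie on the latest lastlogon all survive, and a row whose lastlogon is NULL is never deleted, because a comparison with NULL is never true.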

Answer 2: (score: 0)

I scratched my head over this for a bit:

I tried this, and it looks like it gives me the results I need:

;WITH cte AS
(SELECT ROW_NUMBER() OVER(PARTITION BY c1.employee_id ORDER BY c1.lastlogon DESC) AS Row
    ,[DN]
    ,[first_name]
    ,[last_name]
    ,[init]
    ,[email]
    ,[title]
    ,[display_name]
    ,[department]
    ,[phone_num]
    ,[mob_num]
    ,[fax_num]
    ,[pager_num]
    ,[logon]
    ,[post_code]
    ,[www]
    ,[objectSID]
    ,[disabled]
    ,[lastlogon]
    ,[employee_id]
    ,[acc_type]
FROM AD_Users_All_Staging c1
WHERE EXISTS
(
     SELECT 1
     FROM AD_Users_All_Staging c2
     WHERE c2.employee_id = c1.employee_id
     GROUP BY 
         employee_id
     HAVING COUNT(1) > 1  -- more than one value
)
)
SELECT * FROM cte
WHERE row != 1

Does this look right?
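For the first part of the question (only the most recent record for each duplicated employee_id), one possible variation is to count the rows per employee_id with a window function instead of the EXISTS subquery. This is a sketch assuming SQL Server; `ranked`, `rn` and `cnt` are illustrative names, and the NULL filter is an assumption that mirrors the EXISTS version, which never matches rows with a NULL employee_id.

```sql
;WITH ranked AS (
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY employee_id
                              ORDER BY lastlogon DESC) AS rn,   -- 1 = newest
           COUNT(*)     OVER (PARTITION BY employee_id) AS cnt  -- rows per id
    FROM AD_Users_All_Staging
    WHERE employee_id IS NOT NULL   -- assumption: ignore rows without an id
)
SELECT *
FROM ranked
WHERE rn = 1     -- the most recent row per employee_id
  AND cnt > 1;   -- only for employee_ids that have duplicates
```

Changing the final filter to `rn > 1` returns the older rows instead, which is the same set the query above selects with `row != 1`.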