I have the following query, which identifies duplicate records based on the employee_id field.
SELECT ROW_NUMBER() OVER(PARTITION BY c1.employee_id ORDER BY c1.lastlogon ASC ) AS Row
,[DN]
,[first_name]
,[last_name]
,[init]
,[email]
,[title]
,[display_name]
,[department]
,[phone_num]
,[mob_num]
,[fax_num]
,[pager_num]
,[logon]
,[post_code]
,[www]
,[objectSID]
,[disabled]
,[lastlogon]
,[employee_id]
,[acc_type]
FROM AD_Users_All_Staging c1
WHERE EXISTS
(
SELECT 1
FROM AD_Users_All_Staging c2
WHERE c2.employee_id = c1.employee_id
GROUP BY
employee_id
HAVING COUNT(1) > 1 -- more than one value
)
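How can I select only the most recent record (by the value in the lastlogon field) for each employee_id that has duplicates?
As a follow-up question: how do I delete every record of each duplicate except the most recent one?
Many thanks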
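Answer 0 (score: 0)
Since I don't have your data, I can't easily test this... but it should work if you change the window function to order by c1.lastlogon DESC instead of ASC. The newest record for each employee_id will then always be Row = 1, so you keep that one and delete the rest (Row > 1).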
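A minimal sketch of that suggestion, assuming SQL Server (where a DELETE can target a CTE built on a single table). The names ranked and rn are illustrative, not from the original query. If two rows share the same latest lastlogon, which one ends up as rn = 1 is arbitrary, and rows with a NULL lastlogon sort last under DESC, so they are treated as the oldest.

```sql
-- Sketch only: number the rows of each employee_id from newest to oldest,
-- then delete everything except the newest one.
-- "ranked" and "rn" are hypothetical names chosen for this example.
;WITH ranked AS
(
    SELECT ROW_NUMBER() OVER (PARTITION BY employee_id
                              ORDER BY lastlogon DESC) AS rn
    FROM AD_Users_All_Staging
)
DELETE FROM ranked
WHERE rn > 1;   -- rn = 1 is the most recent row and is kept
```

Employees with a single row only ever get rn = 1, so they are left untouched. It is worth replacing the DELETE with SELECT * first to review which rows would be removed.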
Answer 1 (score: 0)
You can select the most recent record for each employee_id with:
select uas.*
from AD_Users_All_Staging uas
where not exists (select 1
from AD_Users_All_Staging uas2
where uas2.employee_id = uas.employee_id and
uas2.lastlogon > uas.lastlogon
);
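For the delete, you can use the inverse logic. This finds every row that has a newer row for the same employee_id, which are the rows to remove: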
select uas.*
from AD_Users_All_Staging uas
where exists (select 1
from AD_Users_All_Staging uas2
where uas2.employee_id = uas.employee_id and
uas2.lastlogon > uas.lastlogon
);
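Turned into an actual DELETE, the inverse logic might look like the sketch below. It assumes SQL Server's DELETE alias FROM ... syntax; the table and column names are the ones used above.

```sql
-- Sketch only: delete every row for which a newer row with the same
-- employee_id exists, leaving the most recent row per employee_id.
delete uas
from AD_Users_All_Staging uas
where exists (select 1
              from AD_Users_All_Staging uas2
              where uas2.employee_id = uas.employee_id and
                    uas2.lastlogon > uas.lastlogon
             );
```

Two caveats follow from the > comparison: rows tied on the latest lastlogon all survive, and a row whose lastlogon is NULL is never deleted, because the comparison against NULL is never true.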
Answer 2 (score: 0)
This gave me a bit of a headache, but I tried the following and it appears to give me the results I need:
;WITH cte AS
(SELECT ROW_NUMBER() OVER(PARTITION BY c1.employee_id ORDER BY c1.lastlogon DESC) AS Row
,[DN]
,[first_name]
,[last_name]
,[init]
,[email]
,[title]
,[display_name]
,[department]
,[phone_num]
,[mob_num]
,[fax_num]
,[pager_num]
,[logon]
,[post_code]
,[www]
,[objectSID]
,[disabled]
,[lastlogon]
,[employee_id]
,[acc_type]
FROM AD_Users_All_Staging c1
WHERE EXISTS
(
SELECT 1
FROM AD_Users_All_Staging c2
WHERE c2.employee_id = c1.employee_id
GROUP BY
employee_id
HAVING COUNT(1) > 1 -- more than one value
)
)
SELECT * FROM cte
WHERE row != 1
Does this look right?
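For what it's worth, the row != 1 filter returns the older rows of each duplicated employee_id, that is, the candidates for deletion. The WHERE EXISTS check is not strictly needed for that purpose, since an employee_id with a single row only ever gets Row = 1. For the first part of the question (the most recent record, but only where duplicates exist), a sketch along these lines may help. It assumes SQL Server, and ranked, rn and cnt are illustrative names.

```sql
-- Sketch only: newest row per employee_id, restricted to employee_ids
-- that occur more than once. "ranked", "rn" and "cnt" are hypothetical names.
;WITH ranked AS
(
    SELECT *
          ,ROW_NUMBER() OVER (PARTITION BY employee_id
                              ORDER BY lastlogon DESC) AS rn
          ,COUNT(*) OVER (PARTITION BY employee_id) AS cnt
    FROM AD_Users_All_Staging
)
SELECT *
FROM ranked
WHERE rn = 1    -- most recent row in each employee_id group
  AND cnt > 1;  -- only groups that actually contain duplicates
```

One difference from the EXISTS version: rows with a NULL employee_id are grouped together by PARTITION BY, whereas the c2.employee_id = c1.employee_id comparison never matches NULLs, so add AND employee_id IS NOT NULL if such rows should be ignored.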