I have the following table, ONBackup:
Contract FromDate Invoice Data
232 12/12/2017 123
232 14/02/2018 123
232 15/07/2018 123
232 14/02/2017 676
311 12/12/2017 881
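There are a lot of "duplicate" rows. For my purposes a duplicate is a row with the same invoice number; the other fields can differ.
The table has 1.4 million rows (about 1 million of them duplicates), so I'm not sure whether the following will work: I got bored of waiting after 3 hours and started counting, and it would definitely use more CPU than I have.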
DELETE FROM ONBackup
WHERE Invoice NOT IN
(
SELECT MIN(Invoice)
FROM ONBackup
GROUP BY Invoice
)
Is there a faster way that would work?
Answer 0 (score: 5)
Use the row_number() function:
delete b
from (select b.*, row_number() over (partition by b.invoice order by b.fromdate desc) as seq
from ONBackup b
) b
where seq > 1;
This keeps the row with the most recent fromdate for each invoice and deletes the rest.
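Before running the delete on 1.4 million rows, it may be worth previewing which rows it would remove. The query below is only a sketch: it reuses the same window as the delete above but selects instead of deleting. It assumes the column names Contract, FromDate and Invoice from the question, and that FromDate is stored as a date type (if it is stored as dd/mm/yyyy text, the ordering would be alphabetical rather than chronological).

```sql
-- Preview only: nothing is deleted here.
-- Lists the rows the row_number() delete would remove (seq > 1).
-- Assumption: FromDate is a date/datetime column, not text.
SELECT d.Contract, d.FromDate, d.Invoice, d.seq
FROM (
    SELECT b.Contract, b.FromDate, b.Invoice,
           ROW_NUMBER() OVER (PARTITION BY b.Invoice
                              ORDER BY b.FromDate DESC) AS seq
    FROM ONBackup b
) d
WHERE d.seq > 1
ORDER BY d.Invoice, d.seq;
```

With the sample rows from the question, invoice 123 would list the 14/02/2018 and 12/12/2017 rows for deletion and keep the 15/07/2018 row. Invoices 676 and 881 have a single row each, so nothing would be listed for them.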
Answer 1 (score: 4)
I think a CTE is a good choice here (note that the preceding statement must be terminated with a semicolon):
WITH CTE AS
(
SELECT Invoice, ROW_NUMBER() OVER (PARTITION BY Invoice ORDER BY (SELECT 1)) AS RowNumb
FROM ONBackup
)
DELETE FROM CTE WHERE RowNumb > 1
Answer 2 (score: 3)
-- Ordering by the partition column means the surviving row per invoice is arbitrary.
DELETE A
FROM
(
SELECT *, ROW_NUMBER() OVER (PARTITION BY Invoice ORDER BY Invoice) AS rn
FROM ONBackup
) A
WHERE A.rn > 1;