I have a query that looks like this:
;WITH Duplicates AS
(
SELECT *, ROW_NUMBER() OVER (PARTITION BY ChannelName, SerialNumber, ReadingDate ORDER BY ChannelName) AS Rownumber
FROM [Staging].[UriData]
)
DELETE FROM Duplicates WHERE Rownumber > 1
--AND ROWNUMBER >=< ???
OPTION (MAXRECURSION 0)
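This works well and finds the duplicates in the table. However, the table is frequently updated with corrected data.
By the time the query runs, there may already have been three or more updates for the same reading.
This means I want to delete every record except the most recent one. The table has a timestamp field that records when the row was last inserted. I assume I should use this field to decide which row is the newest and delete all the others, instead of relying on an arbitrary row number. Is this the right approach?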
TIA
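Answer 0 (score: 3)
You can certainly use the timestamp column with ROW_NUMBER(), and you do not need the MAXRECURSION hint, because your CTE is not recursive. Ordering each partition by timestamp DESC gives the newest row Rownumber = 1, so deleting Rownumber > 1 keeps only the latest record in each group.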
;WITH Duplicates AS (
SELECT *,
ROW_NUMBER() OVER (PARTITION BY ChannelName, SerialNumber, ReadingDate ORDER BY timestamp DESC) AS Rownumber
FROM [Staging].[UriData]
)
DELETE d
FROM Duplicates d
WHERE Rownumber > 1;
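Before running the delete, it can help to check which rows it would remove. The following is a minimal sketch that reuses the same CTE with a SELECT in place of the DELETE. It assumes the column is literally named timestamp, as in the question, and brackets the name only to avoid confusion with the data type of the same name.

```sql
-- Preview only: lists the rows the DELETE above would remove, without changing any data.
WITH Duplicates AS (
    SELECT *,
           ROW_NUMBER() OVER (
               PARTITION BY ChannelName, SerialNumber, ReadingDate
               ORDER BY [timestamp] DESC   -- newest row in each group gets Rownumber = 1
           ) AS Rownumber
    FROM [Staging].[UriData]
)
SELECT *
FROM Duplicates
WHERE Rownumber > 1                        -- every row except the newest in its group
ORDER BY ChannelName, SerialNumber, ReadingDate, [timestamp] DESC;
```

If the result contains only the outdated rows, the same CTE can be run with the DELETE. Note that when two rows in a group share the same timestamp, ROW_NUMBER() picks one of them arbitrarily as the survivor.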
Answer 1 (score: 1)
DELETE older
FROM Staging.UriData older
WHERE EXISTS(SELECT 1
FROM Staging.UriData newer
WHERE newer.ChannelName = older.ChannelName
and newer.SerialNumber = older.SerialNumber
and newer.ReadingDate = older.ReadingDate
and newer.timestamp > older.timestamp
)
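One difference between the EXISTS form and the ROW_NUMBER() form is how ties are handled: EXISTS removes a row only when a strictly newer row exists, so rows that share the newest timestamp all survive. The sketch below illustrates this on a temporary table; the table #UriData, the Reading column, and the sample values are hypothetical and not part of the original schema.

```sql
-- Hypothetical sample data; the temp table, the Reading column, and all values are illustrative only.
CREATE TABLE #UriData (
    ChannelName  varchar(20),
    SerialNumber varchar(20),
    ReadingDate  date,
    [timestamp]  datetime2(0),
    Reading      int
);

INSERT INTO #UriData (ChannelName, SerialNumber, ReadingDate, [timestamp], Reading)
VALUES
    ('Ch1', 'SN1', '2020-01-01', '2020-01-02 08:00:00', 10),  -- older correction
    ('Ch1', 'SN1', '2020-01-01', '2020-01-03 08:00:00', 11),  -- newest (tied)
    ('Ch1', 'SN1', '2020-01-01', '2020-01-03 08:00:00', 12);  -- newest (tied)

-- Delete every row for which a strictly newer row with the same key exists.
DELETE older
FROM #UriData older
WHERE EXISTS (SELECT 1
              FROM #UriData newer
              WHERE newer.ChannelName  = older.ChannelName
                AND newer.SerialNumber = older.SerialNumber
                AND newer.ReadingDate  = older.ReadingDate
                AND newer.[timestamp]  > older.[timestamp]);

-- Both rows stamped 2020-01-03 08:00:00 remain; only the 2020-01-02 row is gone.
SELECT * FROM #UriData;

DROP TABLE #UriData;
```

If exactly one row per key must remain even when timestamps tie, the ROW_NUMBER() approach is the safer choice, since it always numbers one row as 1.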