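I am writing a query to find duplicate records. My table has the following columns:
Id, Deliveries, TankId, Timestamp.
Duplicate records were inserted: same TankId, same delivery, and a timestamp offset by +1 day.
Now I want to delete the duplicate record with the earlier timestamp.
For example, I have duplicate deliveries for the same TankId on July 24 and July 25. I need to delete the record from the 24th.
I tried the following query: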
SELECT raw.TimeStamp, raw.[Delivery], raw.[TankId]
FROM [dbo].[tObservationData] raw
INNER JOIN (
    SELECT [Delivery], [TankSystemId]
    FROM [dbo].[ObservationData]
    GROUP BY [Delivery], [TankSystemId]
    HAVING COUNT([ObservationDataId]) > 1
) dup
    ON raw.[Delivery] = dup.[Delivery] AND raw.[TankId] = dup.[TankId]
    AND raw.TimeStamp > '2019-06-30 00:00:00.0000000' AND raw.[DeliveryL] > 0
ORDER BY [TankSystemId], TimeStamp
However, the query above also returns other records. How can I find and delete only those duplicates?
Answer 0 (score: 1)
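You can use a window function with PARTITION BY and ORDER BY here: partition by TankId and Delivery, and order by Timestamp in descending order.
In the query below, the record with rn = 1 in each group has the latest timestamp, so you can select only those rows and ignore the rest. The same approach can be used to delete records from the table.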
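SELECT *
FROM (
    -- rn = 1 marks the newest row in each (TankId, Delivery) group
    SELECT *, ROW_NUMBER() OVER (PARTITION BY TankId, Delivery ORDER BY [Timestamp] DESC) AS rn
    FROM [dbo].[ObservationData]
) AS t  -- T-SQL requires an alias on a derived table
WHERE rn = 1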
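The delete mentioned in this answer can be sketched with an updatable CTE in T-SQL. This is only a sketch: it assumes the table is [dbo].[ObservationData] with columns TankId, Delivery, and [Timestamp] (the question uses several spellings, so adjust the names to the real schema). Rows with rn > 1 are the older copies in each group.

```sql
-- Sketch only: table and column names are assumptions taken from the question.
-- Number the rows in each (TankId, Delivery) group, newest first.
WITH ranked AS (
    SELECT ROW_NUMBER() OVER (
               PARTITION BY TankId, Delivery
               ORDER BY [Timestamp] DESC
           ) AS rn
    FROM [dbo].[ObservationData]
)
-- rn = 1 is the newest row per group; every other row is an older duplicate.
DELETE FROM ranked
WHERE rn > 1;
```

It is probably worth running the same CTE with a SELECT in place of the DELETE first, or wrapping the statement in a transaction, to confirm which rows would be removed.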
Answer 1 (score: 0)
Is this what you are looking for?
SELECT od.*
FROM (
    SELECT od.*,
           ROW_NUMBER() OVER (PARTITION BY od.TankId, od.Delivery ORDER BY od.TimeStamp DESC) AS seqnum
    FROM [dbo].[tObservationData] od
) od
WHERE seqnum = 1;
Answer 2 (score: 0)
I think this will work:
SELECT raw.TimeStamp, raw.[Delivery], raw.[TankId]
FROM [dbo].[tObservationData] raw
INNER JOIN (
    SELECT [Delivery], [TankSystemId], MIN([TimeStamp]) AS min_ts
    FROM [dbo].[ObservationData]
    GROUP BY [Delivery], [TankSystemId]
    HAVING COUNT([ObservationDataId]) > 1
) dup
    ON raw.[Delivery] = dup.[Delivery] AND raw.[TankId] = dup.[TankId] AND raw.[TimeStamp] = dup.min_ts
    AND raw.TimeStamp > '2019-06-30 00:00:00.0000000' AND raw.[DeliveryL] > 0
ORDER BY [TankSystemId], TimeStamp
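Since the question describes the duplicates as having timestamps one day apart, another option is to match each row only against a copy dated one day later, which avoids touching rows that merely share a tank and delivery value. A rough sketch, again assuming the table [dbo].[ObservationData] and the columns TankId, Delivery, and [Timestamp], and assuming the offset is exactly one day down to the time part:

```sql
-- Sketch only: names are assumptions; adjust them to the real schema.
-- Delete a row only when the same tank has the same delivery exactly one day later.
DELETE older
FROM [dbo].[ObservationData] AS older
WHERE EXISTS (
    SELECT 1
    FROM [dbo].[ObservationData] AS newer
    WHERE newer.TankId = older.TankId
      AND newer.Delivery = older.Delivery
      AND newer.[Timestamp] = DATEADD(DAY, 1, older.[Timestamp])
);
```

If the time of day differs between the two copies, the equality on [Timestamp] would need to be relaxed, for example by comparing the date parts only.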