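I have the following query, which detects duplicates based on the RegNumber column. If the rows were entered less than 10 minutes apart, the query keeps the one with the highest value in the Confidence column.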
SELECT *,
       CASE
         WHEN conf_max = confidence THEN 'Conf_Max'
         ELSE 'Duplicate'
       END AS Is_Conf_Max
FROM   (SELECT *,
               Max(confidence)
                 OVER (
                   partition BY regnumber) AS Conf_Max
        FROM   (SELECT id,
                       cameraid,
                       dateseen,
                       nationality,
                       regnumber,
                       confidence,
                       Min(dateseen)
                         OVER (
                           partition BY regnumber) AS DateSeen_Min,
                       Max(dateseen)
                         OVER (
                           partition BY regnumber) AS DateSeen_Max
                FROM   plate_read
                WHERE  ( cameraid IN ( 5, 6 ) )) A
        WHERE  Abs(Datediff(minute, dateseen_max, dateseen_min)) <= 10) B
WHERE  conf_max <> confidence
ORDER  BY regnumber
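The problem is this: the query gives me all duplicates whose DateSeen values differ by less than 10 minutes. But if there is another set of duplicates with the same RegNumber more than 10 minutes later, those are not detected. For example: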
ID  CamId  DateSeen             Nationality  Reg      Conf
--  -----  -------------------  -----------  -------  ----
80  5      20/12/2013 12:10:57  E            5897HHS  94
81  5      20/12/2013 12:15:03  E            5897HHS  93
82  5      20/12/2013 12:16:17  GBZ          G6746D   98
83  5      20/12/2013 12:35:57  E            5897HHS  88
84  5      20/12/2013 12:36:03  E            5897HHS  86
Given the data above, only IDs 80, 82 and 83 are valid, because 81 is a duplicate of 80 and 84 is a duplicate of 83. Could someone help with this?
Answer 0: (score: 0)
This may not be a complete answer, but why does your query need to be so complex? Why not simplify it (extra criteria omitted for clarity):
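-- Keep a row only if it has the highest confidence among all reads
-- of the same plate seen within 10 minutes of it.
select *
from   plate_read pr1
where  pr1.confidence = (
         select max(pr2.confidence)
         from   plate_read pr2
         where  pr1.regnumber = pr2.regnumber
                and abs(datediff(minute, pr1.dateseen, pr2.dateseen)) < 11
       )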
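
Another way to look at the question is as a "gaps and islands" problem: order the reads of each RegNumber by DateSeen, start a new group whenever the gap to the previous read is more than 10 minutes, and keep the highest-confidence row in each group. The sketch below is not SQL Server code. It assumes SQLite (3.25 or later, for window functions) driven from Python's sqlite3 module, only so that the sample rows from the question can be run standalone. The CTE names (gaps, grouped, ranked), the helper columns (new_group, grp, rn) and the ISO-formatted dateseen strings are my own assumptions, not part of the original schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE plate_read (
           id INTEGER PRIMARY KEY, cameraid INTEGER, dateseen TEXT,
           nationality TEXT, regnumber TEXT, confidence INTEGER)"""
)

# Sample rows from the question, with dates rewritten in ISO format
# (an assumption, so that SQLite's julianday() can parse them).
rows = [
    (80, 5, "2013-12-20 12:10:57", "E", "5897HHS", 94),
    (81, 5, "2013-12-20 12:15:03", "E", "5897HHS", 93),
    (82, 5, "2013-12-20 12:16:17", "GBZ", "G6746D", 98),
    (83, 5, "2013-12-20 12:35:57", "E", "5897HHS", 88),
    (84, 5, "2013-12-20 12:36:03", "E", "5897HHS", 86),
]
conn.executemany("INSERT INTO plate_read VALUES (?, ?, ?, ?, ?, ?)", rows)

query = """
WITH gaps AS (
    -- new_group = 1 when this is the first read of the plate, or when the
    -- previous read of the same plate is more than 10 minutes earlier.
    SELECT *,
           CASE
             WHEN (julianday(dateseen)
                   - julianday(LAG(dateseen) OVER (
                         PARTITION BY regnumber ORDER BY dateseen, id))
                  ) * 1440 <= 10 THEN 0
             ELSE 1
           END AS new_group
    FROM plate_read
    WHERE cameraid IN (5, 6)
),
grouped AS (
    -- A running total of new_group numbers the groups within each plate.
    SELECT *,
           SUM(new_group) OVER (
               PARTITION BY regnumber ORDER BY dateseen, id) AS grp
    FROM gaps
),
ranked AS (
    -- rn = 1 marks the highest-confidence read in each group.
    SELECT *,
           ROW_NUMBER() OVER (
               PARTITION BY regnumber, grp
               ORDER BY confidence DESC, dateseen, id) AS rn
    FROM grouped
)
SELECT id FROM ranked WHERE rn = 1 ORDER BY id
"""

kept_ids = [row[0] for row in conn.execute(query)]
print(kept_ids)  # [80, 82, 83]
```

On SQL Server 2012 or later the same shape should carry over, with LAG and something like DATEDIFF(second, previous_dateseen, dateseen) <= 600 in place of the julianday arithmetic. DATEDIFF(minute, ...) counts minute boundaries crossed rather than elapsed minutes, so seconds give a more exact cutoff. Note that this grouping chains: a long run of reads each less than 10 minutes after the previous one ends up in a single group, which may or may not be what you want.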