我有一个包含许多列的表,我必须根据这两列进行选择:
TIME ID
-216 AZA
215 AZA
56 EA
-55 EA
66 EA
-03 AR
03 OUI
-999 OP
999 OP
04 AR
87 AR
预期输出
TIME ID
66 EA
03 OUI
87 AR
我需要选择没有匹配的行。有些行具有相同的ID,几乎相同的时间但是反转有点差异。例如,TIME -216的第一行与时间215的第二个记录匹配。我试图以多种方式解决它,但每次我发现自己迷失了。
答案 0 :(得分:2)
第一步 - 查找具有重复ID的行。第二步 - 过滤近似重复的行。
第一步:
SELECT t1.TIME, t2.TIME, t1.ID FROM mytable t1 JOIN mytable
t2 ON t1.ID = t2.ID AND t1.TIME > t2.TIME;
join子句的第二部分确保我们只为每对获得一条记录。
第二步:
SELECT t1.TIME,t2.TIME,t1.ID FROM mytable t1 JOIN mytable t2 ON t1.ID = t2.ID AND
t1.TIME > t2.TIME WHERE ABS(t1.TIME + t2.TIME) < 3;
如果例如,这将产生一些重复的结果。 (10, FI), (-10, FI) and (11, FI)
在您的表中,因为有两个有效对。你可以按如下方式过滤掉这些:
SELECT t1.TIME,MAX(t2.TIME),t1.ID FROM mytable t1 JOIN mytable t2 ON
t1.ID = t2.ID AND t1.TIME > t2.TIME WHERE ABS(t1.TIME + t2.TIME) < 3 GROUP BY
t1.TIME,t1.ID;
但目前还不清楚你想要放弃哪种结果。希望这会指出你正确的方向!
答案 1 :(得分:0)
这有帮助吗?
create table #RawData
(
[Time] int,
ID varchar(3)
)
insert into #rawdata ([time],ID)
select -216, 'AZA'
union
select 215, 'AZA'
union
select 56, 'EA'
union
select -55, 'EA'
union
select 66, 'EA'
union
select -03, 'AR'
union
select 03, 'OUI'
union
select -999, 'OP'
union
select 999, 'OP'
union
select 04, 'AR'
union
select 87, 'AR'
union
-- this value added to illustrate that the algorithm does not ignore this value
select 156, 'EA'
--create a copy with an ID to help out
create table #Data
(
uniqueId uniqueidentifier,
[Time] int,
ID varchar(3)
)
insert into #Data(uniqueId,[Time],ID) select newid(),[Time],ID from #RawData
declare @allowedDifference int
select @allowedDifference = 1
--find duplicates with matching inverse time
select *, d1.Time + d2.Time as pairDifference from #Data d1 inner join #Data d2 on d1.ID = d2.ID and (d1.[Time] + d2.[Time] <=@allowedDifference and d1.[Time] + d2.[Time] >= (-1 * @allowedDifference))
-- now find all ID's ignoring these pairs
select [Time],ID from #data
where uniqueID not in (select d1.uniqueID from #Data d1 inner join #Data d2 on d1.ID = d2.ID and (d1.[Time] + d2.[Time] <=3 and d1.[Time] + d2.[Time] >= -3))