I am trying to identify gaps in a series of number ranges (SQL Server). My situation is as follows:
ID Start End
1 1 4
2 1 6
3 2 4
4 8 10
5 13 14
Visual
-------------------------------
1-2-3-4
1-2-3-4-5-6
2-3-4
- -8-9-10
- - -13-14
The result of this could be:
Table
-------------------------------
ID Start End Gap
4 8 10 -1
5 13 14 -2
Ultimately I would like the gap ranges themselves, but I should be able to work those out from the table above:
Missing
7
11-12
The solutions I have come up with are either too slow or do not account for overlapping ranges (e.g. ID 2):
CREATE TABLE #Docs (
[Rank] INT, --DENSE_RANK () OVER(ORDER BY BegProd)
ControlNumber BIGINT,
BegProd INT,
EndProd INT
)
SELECT
T1.ControlNumber,
T1.BegProd,
T1.EndProd,
MAX(T2.EndProd) AS [PreviousEndProd],
[Gap] = T1.BegProd - MAX(T2.EndProd) - 1
FROM #Docs T1
INNER JOIN #Docs T2
ON T1.[Rank] = T2.[Rank] + 1
AND T1.EndProd > T2.EndProd
GROUP BY T1.ControlNumber, T1.BegProd, T1.EndProd
HAVING T1.BegProd - MAX(T2.EndProd) > 1
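The table has over 2 million rows, and the ranges span from 1 to 1 billion.
Edit: Fixed the "Missing" table. The Gap column indicates how many numbers are missing before that Start value. (The missing #7 is 1 number.)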
Answer 0 (score: 1)
Try this:
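create table #docs(id int, start int, [end] int)
insert #docs values(1,1,4),(2,1,6),(3,2,4),(4,8,10),(5,13,14)
;with a as
(
-- starts that do not fall strictly inside any other range, numbered in order
select start, dense_rank() over (order by start) rn
from #docs t where not exists (select 1 from #docs where t.start > start and t.start < [end])
group by start
), b as
(
-- ends that do not fall strictly inside any other range, numbered in order
select [end], dense_rank() over (order by [end]) rn
from #docs t where not exists (select 1 from #docs where t.[end] > start and t.[end] < [end])
group by [end]
)
select
-- a single missing number is shown alone, otherwise as 'from-to'
case when a.[start] = b.[end] + 2 then cast(a.start - 1 as varchar(10))
else cast(b.[end] + 1 as varchar(10)) + '-' + cast(a.start - 1 as varchar(10)) end missing
-- pair each start with the end that precedes it
from a join b on a.rn - 1 = b.rn
-- '>' rather than '<>' so ranges that share an endpoint (e.g. 1-4 and 4-8) are not reported as a gap
and a.[start] > b.[end] + 1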
Result:
Missing
7
11-12
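
Since the table in the question has over 2 million rows, the two NOT EXISTS self-joins above may be expensive. As an alternative, here is a minimal sketch that uses a running maximum of [end] instead. It is not part of the original answer. It assumes SQL Server 2012 or later (for the ROWS window frame), and the names #docs2, running and prev_max_end are made up for illustration. A row starts a new gap only when its start lies more than 1 past the highest [end] seen on any earlier row, so overlapping ranges such as ID 2 are handled without a self-join.

```sql
-- Sample data from the question; #docs2 is a hypothetical name chosen
-- so it does not clash with the #docs table created above.
CREATE TABLE #docs2 (id INT, start INT, [end] INT);
INSERT #docs2 VALUES (1,1,4),(2,1,6),(3,2,4),(4,8,10),(5,13,14);

WITH running AS
(
    SELECT id, start, [end],
           -- Highest [end] among all rows that sort before this one.
           -- NULL for the first row, which therefore never passes the filter below.
           MAX([end]) OVER (ORDER BY start, [end]
                            ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) AS prev_max_end
    FROM #docs2
)
SELECT id, start, [end],
       -- Count of missing numbers immediately before this row's start.
       start - prev_max_end - 1 AS gap,
       -- A single missing number is shown alone, otherwise as 'from-to'.
       CASE WHEN start - prev_max_end = 2
            THEN CAST(start - 1 AS VARCHAR(10))
            ELSE CAST(prev_max_end + 1 AS VARCHAR(10)) + '-' + CAST(start - 1 AS VARCHAR(10))
       END AS missing
FROM running
WHERE start > prev_max_end + 1   -- keep only rows preceded by a real gap
ORDER BY start;
```

On the sample data this should return the rows for IDs 4 and 5, with missing values 7 and 11-12. An index on (start, [end]) would likely let the window function read the rows in order without a sort, though that has not been measured against the real table.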