Find missing records and return the adjacent records in SQL

Time: 2017-09-29 19:49:18

Tags: sql sql-server

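I need to return, in native SQL, the records adjacent to missing records in a sequence. If the first entry in a sequence is missing, only the next entry should be returned. There is no need to detect a missing entry at the end of a sequence.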

This runs on SQL Server 12.0.2000.8.

Structure of the relevant columns:

BatchId(nvarchar(50), null) 
CreateDate(datetime, null)
UserId(varchar(50), null) 
Batch(varchar(50), null)

The last number in BatchId, after the " - ", determines the order. BatchId is tied to Batch. When the batch changes, the sequence in BatchId should reset to 1.

BatchId         CreateDate              UserId      Batch
#########################################################
9K182855 - 1    2017-09-26 17:57:20.977 9K182855    8
9K182855 - 2    2017-09-26 18:20:57.693 9K182855    8
9K182855 - 1    2017-09-27 11:04:46.177 9K182855    9
9K182855 - 2    2017-09-27 11:19:32.990 9K182855    9
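
As a rough illustration of the rule that the number after the " - " determines the order, here is a minimal Python sketch that pulls the sequence number out of a BatchId. The function name batch_seq is hypothetical and not part of the original schema, and the sketch assumes every BatchId ends in "- <digits>".

```python
import re

def batch_seq(batch_id: str) -> int:
    """Return the trailing integer of a BatchId such as '9K182855 - 2'."""
    # Assumption: the sequence number is the run of digits after the last '-'.
    match = re.search(r"-\s*(\d+)\s*$", batch_id)
    if match is None:
        raise ValueError(f"no trailing sequence number in {batch_id!r}")
    return int(match.group(1))

print(batch_seq("9K182855 - 2"))
```

In the table above, the value restarts at 1 when Batch changes from 8 to 9, so any gap check has to be done per batch rather than across the whole result set.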

The query I use to get the data:

select BatchID, CreateDate, UserId, Batch from Results
where CreateDate > dateadd(day,-2,getdate())
and Batch between 0 and 9
order by UserId, CreateDate, Batch;

This is GOOD data:

BatchId         CreateDate              UserId      Batch
#########################################################
4L182855 - 1    2017-09-28 14:04:46.177 4L182855    9
4L182855 - 2    2017-09-28 15:19:32.990 4L182855    9
4L182855 - 3    2017-09-28 16:30:27.953 4L182855    9
4L182855 - 4    2017-09-28 17:57:20.977 4L182855    9
4L182855 - 5    2017-09-28 18:20:57.693 4L182855    9
4L182855 - 1    2017-09-29 11:04:46.177 4L182855    0
4L182855 - 2    2017-09-29 11:19:32.990 4L182855    0
4L182855 - 3    2017-09-29 11:30:27.953 4L182855    0
4L182855 - 4    2017-09-29 11:57:20.977 4L182855    0
4L182855 - 5    2017-09-29 12:00:57.693 4L182855    0
4L182855 - 6    2017-09-29 12:04:46.177 4L182855    0
4L182855 - 7    2017-09-29 12:19:32.990 4L182855    0
4L182855 - 8    2017-09-29 12:30:27.953 4L182855    0
4L182855 - 9    2017-09-29 13:57:20.977 4L182855    0
4L182855 - 10   2017-09-29 14:20:57.693 4L182855    0

This is data with MISSING records:

BatchId         CreateDate              UserId      Batch
#########################################################
4L182855 - 1    2017-09-28 14:04:46.177 4L182855    9
4L182855 - 2    2017-09-28 15:19:32.990 4L182855    9
4L182855 - 4    2017-09-28 17:57:20.977 4L182855    9
4L182855 - 5    2017-09-28 18:20:57.693 4L182855    9
4L182855 - 1    2017-09-29 11:04:46.177 4L182855    0
4L182855 - 2    2017-09-29 11:19:32.990 4L182855    0
4L182855 - 3    2017-09-29 11:30:27.953 4L182855    0
4L182855 - 4    2017-09-29 11:57:20.977 4L182855    0
4L182855 - 5    2017-09-29 12:00:57.693 4L182855    0
4L182855 - 6    2017-09-29 12:04:46.177 4L182855    0
4L182855 - 7    2017-09-29 12:19:32.990 4L182855    0
4L182855 - 8    2017-09-29 12:30:27.953 4L182855    0
4L182855 - 10   2017-09-29 14:20:57.693 4L182855    0

The requirement is to return the rows below, which are adjacent to the missing records:

BatchId         CreateDate              UserId      Batch
#########################################################
4L182855 - 2    2017-09-28 15:19:32.990 4L182855    9
4L182855 - 4    2017-09-28 17:57:20.977 4L182855    9
4L182855 - 8    2017-09-29 12:30:27.953 4L182855    0
4L182855 - 10   2017-09-29 14:20:57.693 4L182855    0

I can do this in Python or through a CLR user-defined function. However, I am not sure whether it is possible in native SQL. If it is, please enlighten me.

1 answer:

Answer 0 (score: 5):

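Use stuff() to strip everything up to the "- " from BatchId, which leaves the batch sequence number, and use lag() and lead() to get the computed BatchSeq of the previous and next rows (rextester demo: http://rextester.com/ZCBLP37968):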

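select s.BatchId, s.CreateDate, s.UserId, s.Batch
from (
  select t.*
    -- sequence number of the previous row in the same user/batch (NULL on the first row)
    , PrevSeq = lag(x.BatchSeq)  over (partition by t.UserId, t.Batch order by t.CreateDate)
    , x.BatchSeq
    -- sequence number of the next row in the same user/batch (NULL on the last row)
    , NextSeq = lead(x.BatchSeq) over (partition by t.UserId, t.Batch order by t.CreateDate)
  from results t
    -- remove everything up to and including '- ', then convert the remainder to int
    cross apply (values (convert(int,stuff(t.batchid,1,charindex('- ',t.batchid)+1,'')))
      ) x (BatchSeq)
  ) s
where s.BatchSeq - isnull(s.PrevSeq,0) != 1  -- gap before this row, including a missing first entry
  or s.BatchSeq - s.NextSeq != -1            -- gap after this row; a NULL NextSeq never matches
order by s.CreateDate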

This also works when the first record is missing: http://rextester.com/BLAD55913
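
To sanity-check the previous/next comparison outside SQL Server, the same idea can be sketched in Python. This is only an illustration, not part of the answer's query. It assumes both neighbours are looked up within the same (UserId, Batch) group ordered by CreateDate, and the name rows_adjacent_to_gaps is hypothetical. The sample rows are the MISSING data from the question.

```python
from itertools import groupby

def rows_adjacent_to_gaps(rows):
    """rows: (BatchId, CreateDate, UserId, Batch) tuples; returns rows next to a gap."""
    def seq(row):
        # Text after the first "- " in BatchId, like stuff(..., charindex('- ', ...)+1, '').
        return int(row[0].partition("- ")[2])

    ordered = sorted(rows, key=lambda r: (r[2], r[3], r[1]))
    flagged = []
    for _, grp in groupby(ordered, key=lambda r: (r[2], r[3])):  # (UserId, Batch)
        grp = list(grp)
        for i, row in enumerate(grp):
            prev_seq = seq(grp[i - 1]) if i > 0 else 0                # isnull(lag(...), 0)
            next_seq = seq(grp[i + 1]) if i + 1 < len(grp) else None  # lead(...), NULL at the end
            gap_before = seq(row) - prev_seq != 1
            gap_after = next_seq is not None and next_seq - seq(row) != 1
            if gap_before or gap_after:
                flagged.append(row)
    return sorted(flagged, key=lambda r: r[1])  # order by CreateDate

sample = [
    ("4L182855 - 1", "2017-09-28 14:04:46.177", "4L182855", "9"),
    ("4L182855 - 2", "2017-09-28 15:19:32.990", "4L182855", "9"),
    ("4L182855 - 4", "2017-09-28 17:57:20.977", "4L182855", "9"),
    ("4L182855 - 5", "2017-09-28 18:20:57.693", "4L182855", "9"),
    ("4L182855 - 1", "2017-09-29 11:04:46.177", "4L182855", "0"),
    ("4L182855 - 2", "2017-09-29 11:19:32.990", "4L182855", "0"),
    ("4L182855 - 3", "2017-09-29 11:30:27.953", "4L182855", "0"),
    ("4L182855 - 4", "2017-09-29 11:57:20.977", "4L182855", "0"),
    ("4L182855 - 5", "2017-09-29 12:00:57.693", "4L182855", "0"),
    ("4L182855 - 6", "2017-09-29 12:04:46.177", "4L182855", "0"),
    ("4L182855 - 7", "2017-09-29 12:19:32.990", "4L182855", "0"),
    ("4L182855 - 8", "2017-09-29 12:30:27.953", "4L182855", "0"),
    ("4L182855 - 10", "2017-09-29 14:20:57.693", "4L182855", "0"),
]

for batch_id, create_date, user_id, batch in rows_adjacent_to_gaps(sample):
    print(batch_id, create_date, user_id, batch)
```

On this sample the sketch flags the rows with sequence numbers 2 and 4 in batch 9 and 8 and 10 in batch 0, which are the four rows the question asks for. A group that starts at 2 gets only its first row flagged, because the previous sequence defaults to 0.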