Hourly gaps in hourly data

Time: 2018-06-04 11:51:15

Tags: sql sql-server gaps-and-islands

Hello StackOverflowers,

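I am really struggling to find the right approach to what seems to me a relatively simple gap-finding problem. I have a table containing hourly datetimes (hourly log files are imported into the DB). I need to find the missing hours within a period (say, April). Imagine the DB table [imported_logs] contains the following data: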
[2018-04-02 10:00:000]
[2018-04-02 11:00:000]
[2018-04-02 12:00:000] 
[2018-04-02 17:00:000]
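
To reproduce the problem, the sample rows above could be loaded with a small script like this one. It is only a sketch: the column types and the `zone` value of 1 are assumptions, and the table and column names follow the query shown further down rather than the `[imported_logs]` name used in the prose.

```sql
-- Hypothetical schema: the column types are assumptions, not given in the post
CREATE TABLE logsImportedTable (
    zone         INT      NOT NULL,
    hourImported DATETIME NOT NULL
);

-- The four sample hours from the question, all assumed to belong to zone 1
INSERT INTO logsImportedTable (zone, hourImported)
VALUES (1, '2018-04-02T10:00:00'),
       (1, '2018-04-02T11:00:00'),
       (1, '2018-04-02T12:00:00'),
       (1, '2018-04-02T17:00:00');
```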

I need the following result from the gap analysis for April:

[      GAP-BEGIN     ]  [     GAP-END        ]
[2018-04-01 00:00:000]  [2018-04-02 10:00:000] <-- problem
[2018-04-02 13:00:000]  [2018-04-02 17:00:000] <-- can be found using below code
[2018-04-02 18:00:000]  [2018-05-01 00:00:000] <-- problem

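My problem is mainly finding the gaps at the beginning and end of the range; the following code does find the gaps in between the available data.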

    WITH t AS (
      -- Number the rows of zone 1 inside April chronologically
      SELECT  *, rn = ROW_NUMBER() OVER (PARTITION BY zone ORDER BY hourImported)
      FROM  logsImportedTable
      WHERE hourImported >= '2018-04-01' AND hourImported < '2018-05-01' AND zone = 1
    )
    -- Pair each row with its successor; more than 60 minutes apart means missing hours
    SELECT  t1.zone, DATEADD(HOUR, 1, t1.hourImported) AS GapStart, t2.hourImported AS GapEnd
    FROM    t t1
    INNER JOIN t t2 ON t2.zone = t1.zone AND t2.rn = t1.rn + 1
    WHERE   DATEDIFF(MINUTE, t1.hourImported, t2.hourImported) > 60

This only gives the result:

  [zone] [gap_start              ] [gap_end                ]
  [1   ] [2018-04-02 13:00:00.000] [2018-04-02 17:00:00.000]

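So basically, if no logs at all were imported during April, the current implementation would report no missing data (which is kind of wrong).

I think I need to somehow add new data points just before the start and at the end of April, so that the query picks up the start and end of the month as missing data? What would you clever guys/girls do?

/Kind regards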

2 Answers:

Answer 0: (score: 0)

For this situation, just add the initial and end values (if applicable).
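
One way to read this suggestion, sketched under assumptions: add a sentinel row one hour before the period and another at the period end, then compare each row with its successor. The sketch uses the question's `logsImportedTable` with its `zone` and `hourImported` columns; the variable names, the `points` and `pairs` CTE names, and the use of `LEAD` (SQL Server 2012 or later) are illustrative choices, not necessarily what this answer originally contained. It also assumes every imported timestamp falls exactly on the hour.

```sql
DECLARE @zone        INT      = 1,
        @periodStart DATETIME = '2018-04-01',
        @periodEnd   DATETIME = '2018-05-01';

WITH points AS (
    -- Sentinel one hour before the period, so a gap starting at @periodStart is detected
    SELECT DATEADD(HOUR, -1, @periodStart) AS hourImported
    UNION ALL
    -- Real rows for the zone inside [@periodStart, @periodEnd)
    SELECT hourImported
    FROM   logsImportedTable
    WHERE  zone = @zone
      AND  hourImported >= @periodStart
      AND  hourImported <  @periodEnd
    UNION ALL
    -- Sentinel at the period end, so a trailing gap is detected
    SELECT @periodEnd
),
pairs AS (
    -- Pair every point with the next point in time
    SELECT hourImported,
           LEAD(hourImported) OVER (ORDER BY hourImported) AS nextImported
    FROM   points
)
SELECT @zone                          AS zone,
       DATEADD(HOUR, 1, hourImported) AS GapStart,  -- first missing hour
       nextImported                   AS GapEnd     -- next present hour, or the period end
FROM   pairs
WHERE  DATEDIFF(MINUTE, hourImported, nextImported) > 60
ORDER BY GapStart;
```

With the question's sample rows this should report the leading gap, the 13:00 to 17:00 gap, and the trailing gap. For a month with no rows at all, the two sentinels pair up and the whole period comes back as a single gap.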

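Answer 1: (score: 0)

OK, with help from @Gordon, here is my final solution to the problem. It produces gaps even when a whole month of data is missing, as well as all the small gaps in between.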

    DECLARE @zone INT = 1,
            @currentPeriodStart DATETIME = '2018-01-01',
            @currentPeriodEnd   DATETIME = '2018-02-01';

    WITH t AS (
        -- Rows for the zone inside [start, end), numbered chronologically
        SELECT zone_id, time_of_file_present,
               rn = ROW_NUMBER() OVER (ORDER BY time_of_file_present)
        FROM   test
        WHERE  zone_id = @zone
          AND  time_of_file_present >= @currentPeriodStart
          AND  time_of_file_present <  @currentPeriodEnd
    )
    -- Gaps between two consecutive imported hours
    SELECT t1.zone_id,
           DATEADD(HOUR, 1, t1.time_of_file_present) AS gap_start,
           t2.time_of_file_present AS gap_end
    FROM   t t1
           INNER JOIN t t2 ON t2.rn = t1.rn + 1
    WHERE  DATEDIFF(MINUTE, t1.time_of_file_present, t2.time_of_file_present) > 60

    UNION ALL
    -- Gap from the period start to the first imported hour
    SELECT @zone, @currentPeriodStart, MIN(time_of_file_present)
    FROM   t
    HAVING MIN(time_of_file_present) > @currentPeriodStart

    UNION ALL
    -- Gap from the hour after the last imported hour to the period end
    SELECT @zone, DATEADD(HOUR, 1, MAX(time_of_file_present)), @currentPeriodEnd
    FROM   t
    HAVING DATEADD(HOUR, 1, MAX(time_of_file_present)) < @currentPeriodEnd

    UNION ALL
    -- Whole period missing: no rows at all for this zone in the period
    SELECT @zone, @currentPeriodStart, @currentPeriodEnd
    WHERE  NOT EXISTS (SELECT 1 FROM t)

    ORDER BY gap_start;
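
To sanity-check the gaps reported by a query like the one above, it can help to list every missing hour individually. The following is a minimal sketch of a different technique (an hour calendar built with a recursive CTE, then an anti-join), not part of the original answer. It reuses the answer's `test` table and column names, the `hours` CTE and `missing_hour` alias are made up for illustration, and it assumes every `time_of_file_present` value falls exactly on the hour.

```sql
DECLARE @zone INT = 1,
        @currentPeriodStart DATETIME = '2018-01-01',
        @currentPeriodEnd   DATETIME = '2018-02-01';

WITH hours AS (
    -- Recursive CTE: one row per hour in [start, end)
    SELECT @currentPeriodStart AS h
    UNION ALL
    SELECT DATEADD(HOUR, 1, h)
    FROM   hours
    WHERE  DATEADD(HOUR, 1, h) < @currentPeriodEnd
)
SELECT hr.h AS missing_hour
FROM   hours AS hr
WHERE  NOT EXISTS (
           -- Keep only calendar hours that have no imported row for the zone
           SELECT 1
           FROM   test AS lit
           WHERE  lit.zone_id = @zone
             AND  lit.time_of_file_present = hr.h
       )
ORDER BY hr.h
OPTION (MAXRECURSION 0);  -- the default limit is 100 levels; a 31-day month has 744 hours
```

Each reported gap should then correspond to one unbroken run of `missing_hour` values, starting at `gap_start` and ending one hour before `gap_end`.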