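Hello StackOverflowers,
I'm struggling to find the right approach to what seems to me a relatively simple gap-finding problem. I have a table containing hourly datetimes (one log file per hour is imported into the DB). I need to find the hours that are missing within a given period (say, April). Imagine the DB table [imported_logs] contains the following data:
[2018-04-02 10:00:000]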
[2018-04-02 11:00:000]
[2018-04-02 12:00:000]
[2018-04-02 17:00:000]
The gap analysis for April should give me this result:
[ GAP_BEGIN ] [ GAP_END ]
[2018-04-01 00:00:000] [2018-04-02 10:00:000] <-- problem
[2018-04-02 13:00:000] [2018-04-02 17:00:000] <-- can be found using below code
[2018-04-02 18:00:000] [2018-05-01 00:00:000] <-- problem
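My problem is mainly finding the gaps at the beginning and the end of the range; the code below does find the gaps that lie between the available data points.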
WITH t AS (
    -- Number the imported hours for the zone so consecutive rows can be paired
    SELECT *, rn = ROW_NUMBER() OVER (PARTITION BY zone ORDER BY hourImported)
    FROM logsImportedTable
    WHERE hourImported >= '2018-04-01' AND hourImported < '2018-05-01' AND zone = 1
)
-- A gap starts one hour after an imported row and ends at the next imported row
SELECT t1.zone, DATEADD(HOUR, 1, t1.hourImported) AS gap_start, t2.hourImported AS gap_end
FROM t t1
INNER JOIN t t2 ON t2.zone = t1.zone AND t2.rn = t1.rn + 1
WHERE DATEDIFF(MINUTE, t1.hourImported, t2.hourImported) > 60
It only gives this result:
[zone] [gap_start ] [gap_end ]
[1 ] [2018-04-02 13:00:00.000] [2018-04-02 17:00:00.000]
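So basically, if no logs at all were imported during April, the current implementation would report no missing data (which is kind of wrong).
I think I need to somehow add new data points just before the start and just after the end of April, so that the query picks up the start and the end of the month as missing data. What would you smart guys/girls do?
Kind regards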
Answer 0 (score: 0)
For this case, just add the initial and ending values of the period as extra rows (where applicable).
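One possible reading of this suggestion, as a minimal sketch rather than the answerer's own code: pad the imported hours with two sentinel rows, one an hour before the period and one at its exclusive end, and compare each row with its successor. The variable names and the `padded`/`paired` CTEs are assumptions for illustration; the table and column names are taken from the question, and `LEAD` assumes SQL Server 2012 or later.

```sql
DECLARE @periodStart DATETIME = '2018-04-01',
        @periodEnd   DATETIME = '2018-05-01',  -- exclusive end of the period
        @zone        INT      = 1;

WITH padded AS (
    -- Hours that were actually imported for the zone inside the period
    SELECT hourImported
    FROM logsImportedTable
    WHERE zone = @zone
      AND hourImported >= @periodStart
      AND hourImported <  @periodEnd
    UNION ALL
    -- Sentinel one hour before the period, so a missing first hour counts as a gap
    SELECT DATEADD(HOUR, -1, @periodStart)
    UNION ALL
    -- Sentinel at the period end, so missing trailing hours count as a gap
    SELECT @periodEnd
),
paired AS (
    -- Pair every row with the next row in time order
    SELECT hourImported,
           nextImported = LEAD(hourImported) OVER (ORDER BY hourImported)
    FROM padded
)
SELECT @zone AS zone,
       DATEADD(HOUR, 1, hourImported) AS gap_start,  -- first missing hour
       nextImported                   AS gap_end     -- next hour that is present
FROM paired
WHERE DATEDIFF(MINUTE, hourImported, nextImported) > 60
ORDER BY gap_start;
```

With the four sample rows from the question, this should report the three gaps listed there, and a month with no rows at all should come back as a single gap from `@periodStart` to `@periodEnd`, because the two sentinels are then adjacent.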
Answer 1 (score: 0)
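OK, with @Gordon's help, here is my final solution to the problem. It reports the gaps even when the data for the whole month is missing, as well as all the smaller gaps inside the period.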
DECLARE @zone INT = 1,
        @currentPeriodStart DATETIME = '2018-01-01',
        @currentPeriodEnd   DATETIME = '2018-02-01';
WITH t AS (
    -- Hours present for this zone inside [start, end), numbered in time order
    SELECT zone_id, time_of_file_present,
           rn = ROW_NUMBER() OVER (PARTITION BY zone_id ORDER BY time_of_file_present)
    FROM test
    WHERE time_of_file_present >= @currentPeriodStart
      AND time_of_file_present <  @currentPeriodEnd
      AND zone_id = @zone
)
-- Inner gaps: consecutive rows that are more than 60 minutes apart
SELECT t1.zone_id,
       DATEADD(HOUR, 1, t1.time_of_file_present) AS gap_start,
       t2.time_of_file_present AS gap_end
FROM t t1
INNER JOIN t t2 ON t2.zone_id = t1.zone_id AND t2.rn = t1.rn + 1
WHERE DATEDIFF(MINUTE, t1.time_of_file_present, t2.time_of_file_present) > 60
UNION ALL
-- Leading gap: from the period start to the first row present
SELECT @zone, @currentPeriodStart, MIN(time_of_file_present)
FROM t
HAVING MIN(time_of_file_present) > @currentPeriodStart
UNION ALL
-- Trailing gap: from the hour after the last row present to the period end
SELECT @zone, DATEADD(HOUR, 1, MAX(time_of_file_present)), @currentPeriodEnd
FROM t
HAVING DATEADD(HOUR, 1, MAX(time_of_file_present)) < @currentPeriodEnd
UNION ALL
-- Whole period missing: no row for this zone inside the period at all
SELECT @zone, @currentPeriodStart, @currentPeriodEnd
WHERE NOT EXISTS (SELECT 1 FROM t)
ORDER BY gap_start;
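To try the query above, a small test table can be set up along these lines. This is only a sketch: the real schema of `test` is not shown in the thread, so the two-column definition below is an assumption, and the rows simply reuse the sample hours from the question moved to January 2018.

```sql
-- Hypothetical minimal schema: only the two columns the query uses are assumed
CREATE TABLE test (
    zone_id              INT      NOT NULL,
    time_of_file_present DATETIME NOT NULL
);

-- The question's sample hours, shifted into the January period used by the query
INSERT INTO test (zone_id, time_of_file_present) VALUES
    (1, '2018-01-02T10:00:00'),
    (1, '2018-01-02T11:00:00'),
    (1, '2018-01-02T12:00:00'),
    (1, '2018-01-02T17:00:00');
```

With these rows and the January period, the query should return three gaps: one before the first row, one between the 12:00 and 17:00 rows, and one after the last row. Changing the period variables to '2018-03-01' and '2018-04-01' should return a single row covering the whole month, since no rows fall inside it.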