SQL Server: adding missing data values to an interval query

Time: 2018-06-18 17:19:46

Tags: sql-server intervals

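I'm currently working on a reporting requirement and am stuck on the best way to proceed from where I am.

I have a table that captures interval data for a number of requests, in 15-minute intervals, along with an identifier for the type of request being made.

A second table holds the details of the request names.

The source application only writes data if a particular request was made during an interval. As a result, the interval table has gaps for any given type of request. I have no control over the source application and cannot insert data into the interval table, so everything has to be sorted out downstream.

On the output side, I need to generate the missing data. At the moment this is done manually, with some Excel magic and a few hours every week, but I feel I'm very close to solving it on the SQL side and am just missing the last hop (I think).

The following query represents the data structure.

I generate the intervals over the range and then join them to the two data tables. This gets me closer to what I want, but I'm not sure how to properly account for values missing from both tables.

I would like the output to look like: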

interval               |CommonName      |aCounter
=======================|================|===============
all intervals          |each CommonName | value -or- 0
2018-06-17 14:00:00.000|NameOne         | 0
2018-06-17 14:00:00.000|NameTwo         | 0
2018-06-17 14:15:00.000|NameOne         | 1
2018-06-17 14:15:00.000|NameTwo         | 2
2018-06-17 14:30:00.000|NameOne         | 3
2018-06-17 14:30:00.000|NameTwo         | 0
2018-06-17 14:45:00.000|NameOne         | 0
2018-06-17 14:45:00.000|NameTwo         | 0

The query that gets me closest to this output is:

declare @DateFrom datetime = '2018-06-17 14:00:00'
declare @DateTo datetime = '2018-06-17 15:00:00'
declare @Incr int = 15
declare @values table (interval datetime)
declare @dataOne table (interval datetime, identifier varchar(4), aCounter int)
declare @dataTwo table (identifier varchar(4), commonName varchar(50))

--populate the @values table
Begin
    With DateTable As (
        Select DateFrom = @DateFrom
        Union All
        Select DateAdd(MI, @Incr, df.DateFrom)
        From DateTable df
        Where df.DateFrom < @DateTo
    )
    Insert into @values(interval) Select DateFrom From DateTable option (maxrecursion 32767)
End

--populate the @dataOne table
insert into @dataOne values ('2018-06-17 14:15:00.000','500',1)
insert into @dataOne values ('2018-06-17 14:15:00.000','501',2)
insert into @dataOne values ('2018-06-17 14:30:00.000','500',3)
insert into @dataOne values ('2018-06-17 14:30:00.000','502',4)

--populate the @dataTwo table
insert into @dataTwo values ('500', 'NameOne')
insert into @dataTwo values ('501', 'NameTwo')
insert into @dataTwo values ('502', 'NameThree')

select vals.interval
    ,IsNull(dt.commonName,'none') as CommonName
    ,IsNull(do.aCounter,0)  as aCounter
from @values vals
    left join @dataOne do
        on vals.interval = do.interval
        and do.identifier in ('500','501')
    left join @dataTwo dt
        on do.identifier = dt.identifier

But this produces output like:

interval               |CommonName      |aCounter
=======================|================|===============
2018-06-17 14:00:00.000|none            |0
2018-06-17 14:15:00.000|NameOne         |1
2018-06-17 14:15:00.000|NameTwo         |2
2018-06-17 14:30:00.000|NameOne         |3
2018-06-17 14:45:00.000|none            |0
2018-06-17 15:00:00.000|none            |0

This is closer, but not what I'm after.

Can anyone suggest a better alternative?

Thanks in advance.

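Edit #1: I realize my original post may have left out some detail. The cross join solution below accounts for missing intervals; I want to account for both missing intervals and values missing from dataTwo. If I modify the data in the solution provided by @Aaron Dietz, it highlights what I'm trying to say: there is one record in dataTwo that has no corresponding dataOne record. I flipped some of the joins in the step that populates the existing data, but that didn't produce the right results, so I backed those changes out.

I have also changed the temp table to a table variable, because I have to run everything in SSMS and can't touch any schema, not even temp tables. I realize this exposes potential execution/performance problems at scale, but it's what I have to work with.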
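On the performance concern: a table variable can carry a PRIMARY KEY constraint in its declaration, which gives lookups such as the NOT EXISTS check an index to seek on. The following is only a sketch of how the @Final declaration could look. It assumes each (interval, CommonName) pair appears at most once, which may not hold for the real data, and the NOT NULL constraints are part of that assumption.

```sql
-- Hypothetical variant of the @Final declaration used in the script below.
-- The PRIMARY KEY creates a clustered index on (interval, CommonName);
-- inserting a duplicate pair would fail with a key violation.
DECLARE @Final table (
    interval   datetime    NOT NULL,
    CommonName varchar(50) NOT NULL,
    aCounter   int         NOT NULL,
    PRIMARY KEY (interval, CommonName)
);
```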

declare @dataOne table (interval datetime, identifier varchar(4), aCounter int)
declare @dataTwo table (identifier varchar(4), commonName varchar(50))

insert into @dataOne values ('2018-06-17 14:15:00.000','500',1)
insert into @dataOne values ('2018-06-17 14:15:00.000','501',2)
insert into @dataOne values ('2018-06-17 14:30:00.000','500',3)
insert into @dataOne values ('2018-06-17 14:30:00.000','502',4)

insert into @dataTwo values ('500', 'NameOne')
insert into @dataTwo values ('501', 'NameTwo')
insert into @dataTwo values ('502', 'NameThree')
insert into @dataTwo values ('503', 'NameFour')

--Create a table variable to store processed output records
declare @Final table (interval datetime, CommonName varchar(50), aCounter int)

--Populate @Final with interval records that already exist
INSERT INTO @Final (interval, CommonName, aCounter)
SELECT d.interval, d2.commonName, d.aCounter
FROM @dataOne d
JOIN @dataTwo d2 on d.identifier = d2.identifier
WHERE d.identifier IN ('500','501','502','503')

--Set beginning and end intervals
DECLARE @Start datetime = '2018-06-17 14:00:00.000'
DECLARE @End datetime = '2018-06-17 15:00:00.000'

--Loop through intervals and insert missing records
WHILE (@Start <= @End)
BEGIN
    INSERT INTO @Final (interval, CommonName, aCounter)
    SELECT @Start, CommonName, 0
    FROM (SELECT @Start interval) A
    CROSS JOIN (SELECT DISTINCT CommonName FROM @Final) B
    WHERE NOT EXISTS (SELECT *
                      FROM @Final F
                      WHERE F.interval = A.interval
                      AND F.CommonName = B.CommonName)

SET @Start = DATEADD(MINUTE, 15, @Start)
END

--Final output
SELECT *
FROM @Final
ORDER BY interval, CommonName

Edit #2: For the clarification in Edit #1 above, I'm trying to get the output to look like:

interval               |CommonName      |aCounter
=======================|================|===============
all intervals          |each CommonName | value -or- 0
2018-06-17 14:00:00.000|NameOne         | 0
2018-06-17 14:00:00.000|NameTwo         | 0
2018-06-17 14:00:00.000|NameThree       | 0
2018-06-17 14:00:00.000|NameFour        | 0
2018-06-17 14:15:00.000|NameOne         | 1
2018-06-17 14:15:00.000|NameTwo         | 2
2018-06-17 14:15:00.000|NameThree       | 0
2018-06-17 14:15:00.000|NameFour        | 0
2018-06-17 14:30:00.000|NameOne         | 3
2018-06-17 14:30:00.000|NameTwo         | 0
2018-06-17 14:30:00.000|NameThree       | 4
2018-06-17 14:30:00.000|NameFour        | 0
2018-06-17 14:45:00.000|NameOne         | 0
2018-06-17 14:45:00.000|NameTwo         | 0
2018-06-17 14:45:00.000|NameThree       | 0
2018-06-17 14:45:00.000|NameFour        | 0

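There are no intervals associated with "NameFour", so it should show zero for every interval.

I think the following query works, by changing the join in the 'already exists' query that populates the table variable and moving the identifier filter to the outermost query.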

--Set beginning and end intervals
DECLARE @Start datetime = '2018-06-17 14:00:00.000'
DECLARE @End datetime = '2018-06-17 15:00:00.000'

declare @dataOne table (interval datetime, identifier varchar(4), aCounter int)
declare @dataTwo table (identifier varchar(4), commonName varchar(50))

insert into @dataOne values ('2018-06-17 14:15:00.000','500',1)
insert into @dataOne values ('2018-06-17 14:15:00.000','501',2)
insert into @dataOne values ('2018-06-17 14:30:00.000','500',3)
insert into @dataOne values ('2018-06-17 14:30:00.000','502',4)
insert into @dataOne values ('2018-06-17 15:30:00.000','502',4)

insert into @dataTwo values ('500', 'NameOne')
insert into @dataTwo values ('501', 'NameTwo')
insert into @dataTwo values ('502', 'NameThree')
insert into @dataTwo values ('503', 'NameFour')

--Create a table variable to store processed output records
declare @Final table (interval datetime, CommonName varchar(50), identifier varchar(4), aCounter int)

--Populate @Final with interval records that already exist
INSERT INTO @Final (interval, CommonName, identifier, aCounter)
SELECT ISNULL(d.interval,@Start), d2.commonName, d2.identifier, ISNULL(d.aCounter,0)
FROM @dataOne d
RIGHT OUTER JOIN @dataTwo d2 
    on d.identifier = d2.identifier
    and d.interval>=@Start and d.interval<=@End
ORDER BY ISNULL(d.interval,@Start), d2.commonName, ISNULL(d.aCounter,0)

--Loop through intervals and insert missing records
WHILE (@Start <= @End)
BEGIN
    INSERT INTO @Final (interval, CommonName, identifier, aCounter)
    SELECT @Start, CommonName, identifier, 0
    FROM (SELECT @Start interval) A
    CROSS JOIN (SELECT DISTINCT CommonName, identifier FROM @Final) B
    WHERE NOT EXISTS (SELECT *
                      FROM @Final F
                      WHERE F.interval = A.interval
                      AND F.CommonName = B.CommonName)

SET @Start = DATEADD(MINUTE, 15, @Start)
END

--Final output
SELECT *
FROM @Final
WHERE identifier in ('500','501','502','503')
ORDER BY interval, CommonName
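
Where the identifier filter sits matters with an outer join: a WHERE predicate on the outer-joined table rejects the NULL-extended rows, which quietly turns the outer join into an inner join. A small illustration, using two made-up table variables (@names and @data are hypothetical and exist only for this sketch):

```sql
DECLARE @names table (identifier varchar(4), commonName varchar(50));
DECLARE @data  table (identifier varchar(4), aCounter int);

INSERT INTO @names VALUES ('500', 'NameOne'), ('503', 'NameFour');
INSERT INTO @data  VALUES ('500', 1);

-- Filter on the outer-joined table in WHERE: the NULL-extended row for
-- NameFour fails the predicate, so only NameOne is returned.
SELECT n.commonName, ISNULL(d.aCounter, 0) AS aCounter
FROM @names n
LEFT JOIN @data d ON d.identifier = n.identifier
WHERE d.identifier IN ('500', '503');

-- Filter on the preserved table instead: both names are returned,
-- NameFour with a counter of 0.
SELECT n.commonName, ISNULL(d.aCounter, 0) AS aCounter
FROM @names n
LEFT JOIN @data d ON d.identifier = n.identifier
WHERE n.identifier IN ('500', '503');
```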

1 Answer:

Answer 0 (score: 0)

One approach is to insert the records that exist, then use a loop to insert the missing interval records:

--Set beginning and end intervals
DECLARE @Start datetime = '2018-06-17 14:00:00.000'
DECLARE @End datetime = '2018-06-17 15:00:00.000'

declare @dataOne table (interval datetime, identifier varchar(4), aCounter int)
declare @dataTwo table (identifier varchar(4), commonName varchar(50))

insert into @dataOne values ('2018-06-17 14:15:00.000','500',1)
insert into @dataOne values ('2018-06-17 14:15:00.000','501',2)
insert into @dataOne values ('2018-06-17 14:30:00.000','500',3)
insert into @dataOne values ('2018-06-17 14:30:00.000','502',4)
insert into @dataOne values ('2018-06-17 15:30:00.000','502',4)

insert into @dataTwo values ('500', 'NameOne')
insert into @dataTwo values ('501', 'NameTwo')
insert into @dataTwo values ('502', 'NameThree')
insert into @dataTwo values ('503', 'NameFour')

--Create a table variable to store processed output records
declare @Final table (interval datetime, CommonName varchar(50), identifier varchar(4), aCounter int)

--Populate @Final with interval records that already exist
INSERT INTO @Final (interval, CommonName, identifier, aCounter)
SELECT ISNULL(d.interval,@Start), d2.commonName, d2.identifier, ISNULL(d.aCounter,0)
FROM @dataOne d
RIGHT OUTER JOIN @dataTwo d2 
    on d.identifier = d2.identifier
    and d.interval>=@Start and d.interval<=@End
WHERE d2.identifier in ('500','501','502','503') --filter on d2; a filter on d would discard the unmatched rows the RIGHT JOIN preserves
--ORDER BY ISNULL(d.interval,@Start), d2.commonName, ISNULL(d.aCounter,0)

--Loop through intervals and insert missing records
WHILE (@Start <= @End)
BEGIN
    INSERT INTO @Final (interval, CommonName, identifier, aCounter)
    SELECT @Start, CommonName, identifier, 0
    FROM (SELECT @Start interval) A
    CROSS JOIN (SELECT DISTINCT CommonName, identifier FROM @dataTwo) B
    WHERE NOT EXISTS (SELECT *
                      FROM @Final F
                      WHERE F.interval = A.interval
                      AND F.CommonName = B.CommonName)

SET @Start = DATEADD(MINUTE, 15, @Start)
END

--Final output
SELECT *
FROM @Final
ORDER BY interval, identifier
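
A set-based alternative to the loop, offered as a sketch rather than a drop-in replacement: build every interval/name combination with a CROSS JOIN first, then LEFT JOIN the captured data onto that grid, so missing rows come back as NULL and ISNULL turns them into 0. It reuses the sample data above. The CTE name Intervals and the aliases i, n and d are illustrative, and it assumes at most one @dataOne row per interval and identifier.

```sql
DECLARE @Start datetime = '2018-06-17 14:00:00.000';
DECLARE @End   datetime = '2018-06-17 15:00:00.000';

DECLARE @dataOne table (interval datetime, identifier varchar(4), aCounter int);
DECLARE @dataTwo table (identifier varchar(4), commonName varchar(50));

INSERT INTO @dataOne VALUES
    ('2018-06-17 14:15:00.000', '500', 1),
    ('2018-06-17 14:15:00.000', '501', 2),
    ('2018-06-17 14:30:00.000', '500', 3),
    ('2018-06-17 14:30:00.000', '502', 4),
    ('2018-06-17 15:30:00.000', '502', 4);

INSERT INTO @dataTwo VALUES
    ('500', 'NameOne'), ('501', 'NameTwo'),
    ('502', 'NameThree'), ('503', 'NameFour');

-- Every 15-minute interval from @Start through @End, inclusive.
WITH Intervals AS (
    SELECT @Start AS interval
    UNION ALL
    SELECT DATEADD(MINUTE, 15, interval)
    FROM Intervals
    WHERE DATEADD(MINUTE, 15, interval) <= @End
)
SELECT i.interval,
       n.commonName AS CommonName,
       n.identifier,
       ISNULL(d.aCounter, 0) AS aCounter   -- 0 where nothing was captured
FROM Intervals i
CROSS JOIN @dataTwo n                      -- one row per interval per name
LEFT JOIN @dataOne d
    ON d.interval = i.interval
   AND d.identifier = n.identifier
WHERE n.identifier IN ('500', '501', '502', '503')
ORDER BY i.interval, n.identifier
OPTION (MAXRECURSION 32767);
```

With the sample data this returns one row per interval from 14:00 through 15:00 for each of the four names (20 rows), the same range the WHILE loop covers. Changing `<= @End` to `< @End` in the CTE would stop at 14:45, as in the desired output in the question.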