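I have data like this in a table.
I need to get not all of the rows but only rows spaced about 5 minutes apart, in order to reduce their number.
Is there a simple way to do this in T-SQL?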
ID AtTime Speed Label
-------------------------------------------------------------------------
217 2017-06-06 08:01:01.000 0 Lat: 54.3956 Lon: 86.79349
217 2017-06-06 08:01:23.000 0 Lat: 54.3956 Lon: 86.7935
221 2017-06-06 08:04:20.000 0 Lat: 54.39548 Lon: 86.79372
217 2017-06-06 08:06:24.000 0 Lat: 54.39559 Lon: 86.79347
221 2017-06-06 08:09:21.000 0 Lat: 54.39548 Lon: 86.79372
217 2017-06-06 08:11:25.000 0 Lat: 54.3956 Lon: 86.79346
221 2017-06-06 08:12:21.000 0 Lat: 54.39526 Lon: 86.79405
221 2017-06-06 08:12:30.000 0 Lat: 54.39507 Lon: 86.79451
221 2017-06-06 08:12:36.000 14,4 Lat: 54.39503 Lon: 86.79493
221 2017-06-06 08:12:47.000 10,8 Lat: 54.39518 Lon: 86.79536
221 2017-06-06 08:12:56.000 7,2 Lat: 54.39527 Lon: 86.79578
221 2017-06-06 08:13:06.000 7,2 Lat: 54.39529 Lon: 86.79622
221 2017-06-06 08:14:10.000 0 Lat: 54.39545 Lon: 86.79621
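Answer 0 (score: 2)
I hope this fits your needs. We first create a CTE that identifies each five-minute block present in the data set. A second CTE then numbers the rows inside each block by AtTime, and the final query keeps the earliest row of each block. I'm not sure whether ID
should be used as a second partitioning criterion, but I've marked where it could be added: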
declare @t table (ID int not null, AtTime datetime not null, Speed decimal(9,4) not null,
Label varchar(29) not null)
insert into @t(ID,AtTime,Speed,Label) values
(217,'2017-06-06T08:01:01.000',0 ,'Lat: 54.3956 Lon: 86.79349 '),
(217,'2017-06-06T08:01:23.000',0 ,'Lat: 54.3956 Lon: 86.7935 '),
(221,'2017-06-06T08:04:20.000',0 ,'Lat: 54.39548 Lon: 86.79372'),
(217,'2017-06-06T08:06:24.000',0 ,'Lat: 54.39559 Lon: 86.79347'),
(221,'2017-06-06T08:09:21.000',0 ,'Lat: 54.39548 Lon: 86.79372'),
(217,'2017-06-06T08:11:25.000',0 ,'Lat: 54.3956 Lon: 86.79346 '),
(221,'2017-06-06T08:12:21.000',0 ,'Lat: 54.39526 Lon: 86.79405'),
(221,'2017-06-06T08:12:30.000',0 ,'Lat: 54.39507 Lon: 86.79451'),
(221,'2017-06-06T08:12:36.000',14.4,'Lat: 54.39503 Lon: 86.79493'),
(221,'2017-06-06T08:12:47.000',10.8,'Lat: 54.39518 Lon: 86.79536'),
(221,'2017-06-06T08:12:56.000',7.2 ,'Lat: 54.39527 Lon: 86.79578'),
(221,'2017-06-06T08:13:06.000',7.2 ,'Lat: 54.39529 Lon: 86.79622'),
(221,'2017-06-06T08:14:10.000',0 ,'Lat: 54.39545 Lon: 86.79621')
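-- Times: one row per distinct five-minute block that contains at least one reading
;With Times as (
select distinct u.StartBlock,DATEADD(minute,5,u.StartBlock) as EndBlock
from @t
cross apply
-- integer division floors the minute count to a multiple of 5
(select DATEADD(minute,((DATEDIFF(minute,0,AtTime)/5)*5),0) as StartBlock) u
), Ordered as (
-- Ordered: number the rows inside each block, earliest first
select
*,
ROW_NUMBER() OVER (PARTITION BY StartBlock /* And ID? */ ORDER BY AtTime) as rn
from
@t t
inner join
Times tm
on
tm.StartBlock <= t.AtTime and
t.AtTime < tm.EndBlock
)
-- keep only the earliest row of each block
select *
from Ordered
where rn = 1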
Result:
ID AtTime Speed Label StartBlock EndBlock rn
----------- ----------------------- ------- ----------------------------- ----------------------- ----------------------- --
217 2017-06-06 08:01:01.000 0.0000 Lat: 54.3956 Lon: 86.79349 2017-06-06 08:00:00.000 2017-06-06 08:05:00.000 1
217 2017-06-06 08:06:24.000 0.0000 Lat: 54.39559 Lon: 86.79347 2017-06-06 08:05:00.000 2017-06-06 08:10:00.000 1
217 2017-06-06 08:11:25.000 0.0000 Lat: 54.3956 Lon: 86.79346 2017-06-06 08:10:00.000 2017-06-06 08:15:00.000 1
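If each ID should be thinned independently (my guess at what the "And ID?" marker is for, not something the question confirms), a minimal sketch is to add ID to the partition. This variant also computes the block start directly on each row instead of joining to a separate list of blocks; the expression is the same one used above, where integer division by 5 floors the minute count to a multiple of 5. The column list and the final ORDER BY are my own choices.

```sql
declare @t table (ID int not null, AtTime datetime not null, Speed decimal(9,4) not null,
                  Label varchar(29) not null)
insert into @t(ID,AtTime,Speed,Label) values
(217,'2017-06-06T08:01:01.000',0   ,'Lat: 54.3956 Lon: 86.79349'),
(217,'2017-06-06T08:01:23.000',0   ,'Lat: 54.3956 Lon: 86.7935'),
(221,'2017-06-06T08:04:20.000',0   ,'Lat: 54.39548 Lon: 86.79372'),
(217,'2017-06-06T08:06:24.000',0   ,'Lat: 54.39559 Lon: 86.79347'),
(221,'2017-06-06T08:09:21.000',0   ,'Lat: 54.39548 Lon: 86.79372'),
(217,'2017-06-06T08:11:25.000',0   ,'Lat: 54.3956 Lon: 86.79346'),
(221,'2017-06-06T08:12:21.000',0   ,'Lat: 54.39526 Lon: 86.79405'),
(221,'2017-06-06T08:12:30.000',0   ,'Lat: 54.39507 Lon: 86.79451'),
(221,'2017-06-06T08:12:36.000',14.4,'Lat: 54.39503 Lon: 86.79493'),
(221,'2017-06-06T08:12:47.000',10.8,'Lat: 54.39518 Lon: 86.79536'),
(221,'2017-06-06T08:12:56.000',7.2 ,'Lat: 54.39527 Lon: 86.79578'),
(221,'2017-06-06T08:13:06.000',7.2 ,'Lat: 54.39529 Lon: 86.79622'),
(221,'2017-06-06T08:14:10.000',0   ,'Lat: 54.39545 Lon: 86.79621')

;with Ordered as (
    select t.ID, t.AtTime, t.Speed, t.Label, u.StartBlock,
           -- restart the numbering for every (ID, five-minute block) pair
           ROW_NUMBER() over (partition by t.ID, u.StartBlock order by t.AtTime) as rn
    from @t t
    cross apply
        -- floor AtTime to the start of its five-minute block
        (select DATEADD(minute, (DATEDIFF(minute, 0, t.AtTime) / 5) * 5, 0) as StartBlock) u
)
select ID, AtTime, Speed, Label, StartBlock
from Ordered
where rn = 1          -- earliest reading per ID per block
order by ID, AtTime
```

On the sample data this should keep six rows, three for each ID, because both 217 and 221 have at least one reading in each of the 08:00, 08:05 and 08:10 blocks.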
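Note that this block-based approach does not guarantee that all rows are at least 5 minutes apart. In a pathological case you could end up with two rows that are only a moment apart (for example, if one 5-minute interval has a single row at its last possible instant and the next interval has a row right at its start). For a normal distribution of data, however, the rows will on average be about five minutes apart.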
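If a hard minimum gap matters more than simplicity, one alternative (a different technique from the block approach above, sketched here under my own assumptions) is a greedy walk: keep the earliest row, then repeatedly keep the first row that is at least five minutes after the last kept one. The sketch below uses a recursive CTE with ROW_NUMBER in the recursive member, since TOP is not allowed there. The name Kept is hypothetical, the walk ignores ID, and I would test it on your SQL Server version and data volume before relying on it.

```sql
declare @t table (ID int not null, AtTime datetime not null, Speed decimal(9,4) not null,
                  Label varchar(29) not null)
insert into @t(ID,AtTime,Speed,Label) values
(217,'2017-06-06T08:01:01.000',0   ,'Lat: 54.3956 Lon: 86.79349'),
(217,'2017-06-06T08:01:23.000',0   ,'Lat: 54.3956 Lon: 86.7935'),
(221,'2017-06-06T08:04:20.000',0   ,'Lat: 54.39548 Lon: 86.79372'),
(217,'2017-06-06T08:06:24.000',0   ,'Lat: 54.39559 Lon: 86.79347'),
(221,'2017-06-06T08:09:21.000',0   ,'Lat: 54.39548 Lon: 86.79372'),
(217,'2017-06-06T08:11:25.000',0   ,'Lat: 54.3956 Lon: 86.79346'),
(221,'2017-06-06T08:12:21.000',0   ,'Lat: 54.39526 Lon: 86.79405'),
(221,'2017-06-06T08:12:30.000',0   ,'Lat: 54.39507 Lon: 86.79451'),
(221,'2017-06-06T08:12:36.000',14.4,'Lat: 54.39503 Lon: 86.79493'),
(221,'2017-06-06T08:12:47.000',10.8,'Lat: 54.39518 Lon: 86.79536'),
(221,'2017-06-06T08:12:56.000',7.2 ,'Lat: 54.39527 Lon: 86.79578'),
(221,'2017-06-06T08:13:06.000',7.2 ,'Lat: 54.39529 Lon: 86.79622'),
(221,'2017-06-06T08:14:10.000',0   ,'Lat: 54.39545 Lon: 86.79621')

;with Kept as (
    -- anchor: the earliest reading overall (wrapped in a derived table
    -- because ORDER BY cannot sit directly on one branch of a UNION ALL)
    select a.ID, a.AtTime, a.Speed, a.Label
    from (select top (1) ID, AtTime, Speed, Label from @t order by AtTime) a
    union all
    -- recursive step: the first reading at least 5 minutes after the last kept one
    select n.ID, n.AtTime, n.Speed, n.Label
    from (
        select t.ID, t.AtTime, t.Speed, t.Label,
               ROW_NUMBER() over (order by t.AtTime) as rn
        from Kept k
        inner join @t t
            on t.AtTime >= DATEADD(minute, 5, k.AtTime)
    ) n
    where n.rn = 1
)
select ID, AtTime, Speed, Label
from Kept
order by AtTime
option (maxrecursion 0)   -- lift the default limit of 100 recursion levels
```

Each recursion step searches the table again, so on a large table this is likely to be slower than the block-based query unless AtTime is indexed. The trade-off is that consecutive kept rows are always at least five minutes apart.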