I have a SQL Server table named table1 with a timestamp column column_ts and a few other columns, say column1, column2, column3.
The table looks like this:
column_ts column1 column2 column3
2016-09-30 00:04:00.000 number1 string1 integer1
2016-09-30 00:24:00.000 number2 string2 integer2
2016-09-30 00:29:00.000 number3 string3 integer3
2016-09-30 00:44:00.000 number4 string4 integer4
2016-09-30 00:48:00.000 number5 string5 integer5
2016-09-30 01:04:00.000 number6 string6 integer6
2016-09-30 01:24:00.000 number7 string7 integer7
2016-09-30 01:54:00.000 number8 string8 integer8
2016-09-30 01:59:00.000 number9 string9 integer9
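First, I select the records where column_ts >= '2016-09-30 00:00:00.000'.
Then, from each 30-minute window of column_ts, I want to select only the one row with the highest timestamp.
So, for the given data, the query should select only the following rows: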
column_ts column1 column2 column3
2016-09-30 00:29:00.000 number3 string3 integer3
2016-09-30 00:48:00.000 number5 string5 integer5
2016-09-30 01:24:00.000 number7 string7 integer7
2016-09-30 01:59:00.000 number9 string9 integer9
In other words, I want to build 30-minute windows over column_ts, such as:
1)2016-09-30 00:00:00.000 - 2016-09-30 00:30:00.000
2)2016-09-30 00:30:00.000 - 2016-09-30 01:00:00.000
3)2016-09-30 01:00:00.000 - 2016-09-30 01:30:00.000
4)2016-09-30 01:30:00.000 - 2016-09-30 02:00:00.000
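Finally, from each of these 30-minute windows I want to select the one row with the highest column_ts value.
I can't figure out how to generate the 30-minute windows from which I can select MAX(column_ts).
Please suggest how I can do this.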
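Answer 0 (score: 3)
You can take the date difference in minutes from an epoch and divide it by 30 to group the rows into 30-minute intervals (the integer division discards the remainder).
This query gives each 30-minute slot together with the maximum column_ts in that slot: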
select dateadd(minute, datediff(minute, '1970-1-1',column_ts)/30*30,'1970-1-1') as timegroup,
MAX(column_ts) as max_time
from table1 where column_ts >= '2016-09-30 00:00:00.000'
group by datediff(minute, '1970-1-1', column_ts) / 30
The above produces:
timegroup max_time
2016-09-30 00:00:00.000 2016-09-30 00:29:00.000
2016-09-30 00:30:00.000 2016-09-30 00:48:00.000
2016-09-30 01:00:00.000 2016-09-30 01:24:00.000
2016-09-30 01:30:00.000 2016-09-30 01:59:00.000
Once you have that, you can use it in a subquery to get the result you are after:
select groups.timegroup, t.column_ts, t.column1, t.column2, t.column3
from (
select dateadd(minute, datediff(minute, '1970-1-1',column_ts)/30*30,'1970-1-1') as timegroup,MAX(column_ts) as max_time
from table1 where column_ts >= '2016-09-30 00:00:00.000'
group by datediff(minute, '1970-1-1', column_ts) / 30
) as groups
inner join table1 t on t.column_ts = groups.max_time
which produces
timegroup column_ts column1 column2 column3
2016-09-30 00:00:00.000 2016-09-30 00:29:00.000 number3 string3 integer3
2016-09-30 00:30:00.000 2016-09-30 00:48:00.000 number5 string5 integer5
2016-09-30 01:00:00.000 2016-09-30 01:24:00.000 number7 string7 integer7
2016-09-30 01:30:00.000 2016-09-30 01:59:00.000 number9 string9 integer9
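As a cross-check on the arithmetic in this answer, here is a minimal Python sketch of the same bucketing idea. Python is used only for illustration; the helper name `window_start` and the `rows` list are assumptions made for the example, not part of the answer. Integer-dividing the minutes elapsed since the epoch by 30 and multiplying back by 30 gives the start of the window a timestamp falls in.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)

def window_start(ts, minutes=30):
    """Start of the window containing ts (hypothetical helper)."""
    # Whole minutes since the epoch, like datediff(minute, '1970-1-1', ts)
    # for timestamps after the epoch.
    elapsed = (ts - EPOCH) // timedelta(minutes=1)
    # Integer division drops the remainder; multiplying back gives the slot start.
    return EPOCH + timedelta(minutes=elapsed // minutes * minutes)

# The sample timestamps from the question.
rows = [datetime(2016, 9, 30, h, m)
        for h, m in [(0, 4), (0, 24), (0, 29), (0, 44), (0, 48),
                     (1, 4), (1, 24), (1, 54), (1, 59)]]

# Keep the latest timestamp per window, like MAX(column_ts) with GROUP BY.
latest = {}
for ts in rows:
    key = window_start(ts)
    if key not in latest or ts > latest[key]:
        latest[key] = ts
```

For the sample data, `latest` ends up with four entries, one per half-hour window, matching the timegroup / max_time pairs shown above.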
Answer 1 (score: 2)
Assuming you are on SQL Server 2005 or later, here is a script:
use tempdb
--drop table dbo.t
create table dbo.t (column_ts datetime, column1 varchar(30), column2 varchar(30), column3 varchar(30));
go
-- populate the table
insert into dbo.t (column_ts, column1, column2, column3)
select '2016-09-30 00:04:00.000','number1','string1','integer1'
union all select '2016-09-30 00:24:00.000','number2','string2','integer2'
union all select '2016-09-30 00:29:00.000','number3','string3','integer3'
union all select '2016-09-30 00:44:00.000','number4','string4','integer4'
union all select '2016-09-30 00:48:00.000','number5','string5','integer5'
union all select '2016-09-30 01:04:00.000','number6','string6','integer6'
union all select '2016-09-30 01:24:00.000','number7','string7','integer7'
union all select '2016-09-30 01:54:00.000','number8','string8','integer8'
union all select '2016-09-30 01:59:00.000','number9','string9','integer9';
go
-- the query
; with c as (
select section=datediff(minute, '2016-09-30', column_ts)/30, * from dbo.t
)
, c2 as (select rnk=rank() over (partition by section order by column_ts desc), * from c)
select column_ts, column1, column2, column3
from c2
where rnk = 1;
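I did something similar before, when I had collected a performance trace and needed to find the most expensive query in each 30-minute window.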
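One thing worth knowing about the query above: rank() gives rank 1 to every row that shares the top timestamp in a section, so ties return more than one row, whereas row_number() would return exactly one (which of the tied rows is arbitrary unless the ordering is made unique). The Python sketch below only illustrates that difference; the function `top_per_section` and the tuple layout are assumptions made for the example.

```python
from itertools import groupby

# (section, column_ts, column1) tuples; two rows share the top timestamp in section 0.
rows = [(0, "00:24", "a"), (0, "00:29", "b"), (0, "00:29", "c"), (1, "00:48", "d")]

def top_per_section(rows, keep_ties):
    """Return the latest row(s) per section (hypothetical helper).

    keep_ties=True mimics rank() ... where rnk = 1;
    keep_ties=False mimics row_number() ... where rn = 1.
    """
    out = []
    ordered = sorted(rows, key=lambda r: (r[0], r[1]))
    for _, grp in groupby(ordered, key=lambda r: r[0]):
        grp = list(grp)
        top = max(r[1] for r in grp)
        winners = [r for r in grp if r[1] == top]
        out.extend(winners if keep_ties else winners[:1])
    return out

with_ties = top_per_section(rows, keep_ties=True)
one_each = top_per_section(rows, keep_ties=False)
```

Here `with_ties` contains three rows (both tied rows from section 0 plus the row from section 1), while `one_each` contains two.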
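Answer 2 (score: 1)
I would generate a table of intervals and join it to your data. Then add a row_number() within each interval, ordered by column_ts descending, and return only the top row (RN = 1).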
DECLARE @Test TABLE (column_ts datetime, column1 varchar(50), column2 varchar(50), column3 varchar(50))
INSERT INTO @Test
VALUES ('2016-09-30 00:04:00.000','number1','string1','integer1'),
('2016-09-30 00:24:00.000','number2','string2','integer2'),
('2016-09-30 00:29:00.000','number3','string3','integer3'),
('2016-09-30 00:44:00.000','number4','string4','integer4'),
('2016-09-30 00:48:00.000','number5','string5','integer5'),
('2016-09-30 01:04:00.000','number6','string6','integer6'),
('2016-09-30 01:24:00.000','number7','string7','integer7'),
('2016-09-30 01:54:00.000','number8','string8','integer8'),
('2016-09-30 01:59:00.000','number9','string9','integer9')
DECLARE @TimeGrid TABLE (IntervalStart TIME, IntervalEnd TIME)
DECLARE @MyTime TIME, @true BIT=1
WHILE @true=1
BEGIN
IF @MyTime IS NULL SET @MyTime = CONVERT(TIME,'00:00:00')
INSERT INTO @TimeGrid (IntervalStart,IntervalEnd)
SELECT @MyTime, DATEADD(NS,-100,DATEADD(MI,30,@MyTime))
SET @MyTime=DATEADD(MI,30,@MyTime)
IF @MyTime= CONVERT(TIME,'00:00:00')
SET @true=0
END
;WITH X AS
(
SELECT *
FROM @Test T
JOIN @TimeGrid TG ON CONVERT(TIME,T.column_ts) BETWEEN TG.IntervalStart AND TG.IntervalEnd
), Y AS
(
SELECT *,
ROW_NUMBER() OVER(PARTITION BY IntervalStart ORDER BY column_ts DESC) AS RN
FROM X
)
SELECT column_ts, column1, column2, column3--, IntervalStart, IntervalEnd, RN
FROM Y
WHERE RN=1
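To make the interval grid concrete, here is a small Python sketch of what the WHILE loop builds: 48 half-hour intervals covering one day, each ending just before the next one starts. The names `half_hour_grid` and `interval_for` are assumptions made for the example, and the end of each interval is one microsecond before the next start (Python's resolution) rather than the 100 ns used in the SQL.

```python
from datetime import datetime, time, timedelta

def half_hour_grid():
    """List of (interval_start, interval_end) times for one day (hypothetical helper)."""
    base = datetime(2000, 1, 1)  # arbitrary date; only the time of day is kept
    step = timedelta(minutes=30)
    grid = []
    for i in range(48):
        start = base + i * step
        end = start + step - timedelta(microseconds=1)
        grid.append((start.time(), end.time()))
    return grid

def interval_for(t, grid):
    """Find the interval containing time-of-day t, like the BETWEEN join."""
    for start, end in grid:
        if start <= t <= end:
            return start, end
    raise ValueError("time not covered by grid")

grid = half_hour_grid()
slot = interval_for(time(0, 48), grid)
```

Because the grid holds times of day only, rows from different dates that fall in the same half hour land in the same interval. That is fine for the single-day sample, but data spanning several days would also need the date in the partition.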
Answer 3 (score: 1)
;WITH cte AS (
SELECT
*
,ROW_NUMBER() OVER (PARTITION BY
CASE
WHEN DATEPART(MINUTE,column_ts) >= 30 THEN DATEADD(MINUTE,30 - DATEPART(MINUTE,column_ts),column_ts)
ELSE DATEADD(MINUTE,- DATEPART(MINUTE,column_ts),column_ts)
END
ORDER BY column_ts DESC) as RowNumber
FROM
@Table1
)
SELECT *
FROM
cte
WHERE
RowNumber = 1
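You could generate a table of 30-minute intervals as others have shown, but all you really need to do is round down to the hour mark when the minute is under 30, and round down to the half-hour mark when it is 30 or more. That creates the groups, so there is no need for a recursive CTE.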
CASE
WHEN DATEPART(MINUTE,column_ts) >= 30 THEN DATEADD(MINUTE,30 - DATEPART(MINUTE,column_ts),column_ts)
ELSE DATEADD(MINUTE,- DATEPART(MINUTE,column_ts),column_ts)
END as HalfHourGroup
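The rounding rule can be sketched in a few lines of Python to check the boundary cases. This is only an illustration of the CASE expression, with minute 30 treated as the start of the second half hour; the function name `half_hour_group` is an assumption made for the example.

```python
from datetime import datetime, timedelta

def half_hour_group(ts):
    """Round ts down to the hour or half-hour mark (hypothetical helper)."""
    minute = ts.minute
    if minute >= 30:
        # Same as DATEADD(MINUTE, 30 - minute, ts): step back to hh:30.
        return ts + timedelta(minutes=30 - minute)
    # Same as DATEADD(MINUTE, -minute, ts): step back to hh:00.
    return ts - timedelta(minutes=minute)

first_half = half_hour_group(datetime(2016, 9, 30, 0, 29))
boundary = half_hour_group(datetime(2016, 9, 30, 0, 30))
second_half = half_hour_group(datetime(2016, 9, 30, 1, 59))
```

As in the SQL, only the minutes are adjusted, so any seconds on the timestamp carry through into the group value. With the sample data, which has no seconds, each group value is an exact hour or half-hour mark.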
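Answer 4 (score: 1)
@petelids's answer looks right to me, but I'll offer an alternative that doesn't use a literal date in the calculation. I think you might even find it reads a little more clearly. Based on your sample data I'm assuming you don't store seconds. You could also ignore the seconds in the output through some formatting options. Either way, the seconds don't matter.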
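Edit
After rereading your question I realize you want the entire row as the result. You can still use this approach, although the row_number() technique is probably more common these days and can be very fast.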
select
dateadd(
minute,
-datepart(minute, min(column_ts)) % 30,
min(column_ts)
) as timegroup,
max(column_ts) as max_time_in_window
from T
group by
cast(column_ts as date),
datepart(hour, column_ts),
datepart(minute, column_ts) / 30;
Or, to return whole rows, filter with a subquery over the same group by:
select * from T
where column_ts in (
select max(column_ts) as max_time_in_window
from T
group by
cast(column_ts as date),
datepart(hour, column_ts),
datepart(minute, column_ts) / 30
);
Answer 5 (score: 0)
It can be done without window functions:
select max(column_ts) column_ts, column1, column2, column3
from mytable
where column_ts >= '2016-09-30 00:00:00.000'
group by column1, column2, column3
To get results for multiple time periods, group by the time bracket as well:
select max(column_ts) column_ts, column1, column2, column3
from mytable
group by column1, column2, column3, <expression to calculate a unique value for each column_ts bracket>
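Answer 6 (score: 0)
I do this by generating the table of "intervals" separately as a CTE. If you do this a lot, you may want to persist the intervals in a table so that you can join to them. You should also think about what you want to happen when two events have the same timestamp...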
{{1}}
(Warning: the script may not work tomorrow...)