I can count the events in the last ten minutes using a traditional subquery approach. For example:
drop table if exists [dbo].[readings]
go
create table [dbo].[readings](
[server] [int] NOT NULL,
[sampled] [datetime] NOT NULL
)
go
insert into readings
values
(1,'20170101 08:00'),
(1,'20170101 08:02'),
(1,'20170101 08:05'),
(1,'20170101 08:30'),
(1,'20170101 08:31'),
(1,'20170101 08:37'),
(1,'20170101 08:40'),
(1,'20170101 08:41'),
(1,'20170101 09:07'),
(1,'20170101 09:08'),
(1,'20170101 09:09'),
(1,'20170101 09:11')
go
-- Count in the last 10 minutes - example periods 08:31 to 08:40, 09:12 to 09:21
select server,sampled,(select count(*) from readings r2 where r2.server=r1.server and r2.sampled <= r1.sampled and r2.sampled > dateadd(minute,-10,r1.sampled)) as countinlast10minutes
from readings r1
order by server,sampled
go
How can I get the same result using a window function? I tried this:
select server,sampled,
count(case when sampled <= r1.sampled and sampled > dateadd(minute,-10,r1.sampled) then 1 else null end) over (partition by server order by sampled rows between unbounded preceding and current row) as countinlast10minutes
-- count(case when currentrow.sampled <= r1.sampled and currentrow.sampled > dateadd(minute,-10,r1.sampled) then 1 else null end) over (partition by server order by sampled rows between unbounded preceding and current row) as countinlast10minutes
from readings r1
order by server,sampled
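But the result is just a running count. Is there a system variable that refers to the current row pointer, something like currentrow.sampled?
Answer 0: (score: 2)
This isn't a very pleasing answer, but one possibility is to first create a helper table containing every minute in the range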
CREATE TABLE #DateTimes(datetime datetime primary key);
WITH E1(N) AS
(
SELECT 1 FROM (VALUES(1),(1),(1),(1),(1),
(1),(1),(1),(1),(1)) V(N)
) -- 1*10^1 or 10 rows
, E2(N) AS (SELECT 1 FROM E1 a, E1 b) -- 1*10^2 or 100 rows
, E4(N) AS (SELECT 1 FROM E2 a, E2 b) -- 1*10^4 or 10,000 rows
, E8(N) AS (SELECT 1 FROM E4 a, E4 b) -- 1*10^8 or 100,000,000 rows
,R(StartRange, EndRange)
AS (SELECT MIN(sampled),
MAX(sampled)
FROM readings)
,N(N)
AS (SELECT ROW_NUMBER()
OVER (
ORDER BY (SELECT NULL)) AS N
FROM E8)
INSERT INTO #DateTimes
SELECT TOP (SELECT 1 + DATEDIFF(MINUTE, StartRange, EndRange) FROM R) DATEADD(MINUTE, N.N - 1, StartRange)
FROM N,
R;
Then use ROWS BETWEEN 9 PRECEDING AND CURRENT ROW
WITH T1 AS
( SELECT Server,
MIN(sampled) AS StartRange,
MAX(sampled) AS EndRange
FROM readings
GROUP BY Server )
SELECT Server,
sampled,
Cnt
FROM T1
CROSS APPLY
( SELECT r.sampled,
COUNT(r.sampled) OVER (ORDER BY N.datetime ROWS BETWEEN 9 PRECEDING AND CURRENT ROW) AS Cnt
FROM #DateTimes N
LEFT JOIN readings r
ON r.sampled = N.datetime
AND r.server = T1.server
WHERE N.datetime BETWEEN StartRange AND EndRange ) CA
WHERE CA.sampled IS NOT NULL
ORDER BY sampled
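The above assumes there is at most one sample per minute and that all times fall on exact minutes. If that isn't true, it would need another table expression that pre-aggregates by datetime rounded to the minute.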
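A minimal sketch of such a pre-aggregation step, assuming samples are truncated (not rounded) to the start of their minute. The names PerMinute, sampled_minute and samples are made up for illustration:

```sql
-- Hypothetical pre-aggregation: one row per server per minute.
-- DATEADD/DATEDIFF against the base date 0 truncates each datetime to the minute.
WITH PerMinute AS
(
    SELECT server,
           DATEADD(MINUTE, DATEDIFF(MINUTE, 0, sampled), 0) AS sampled_minute,
           COUNT(*) AS samples   -- number of readings that fall in this minute
    FROM   readings
    GROUP  BY server,
              DATEADD(MINUTE, DATEDIFF(MINUTE, 0, sampled), 0)
)
SELECT server,
       sampled_minute,
       samples
FROM   PerMinute
ORDER  BY server,
          sampled_minute;
```

With something like this in place, the windowed query would join #DateTimes to PerMinute on sampled_minute and use SUM(samples) in place of COUNT(r.sampled). The helper table would also need to start on a whole minute, and the result then counts per minute bucket instead of per exact timestamp.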
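Answer 1: (score: 1)
As far as I know, there is no simple drop-in replacement for your subquery using window functions.
Window functions operate on a set of rows and let you work with them based on partitions and ordering. What you are trying to do isn't the kind of partitioning we can express in a window function. Generating the partitions we would need in order to use window functions here would only result in overly complicated code.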
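One way to see why the attempt in the question collapses to a running count: inside the CASE, sampled and r1.sampled both resolve to the same input row, so the condition is true for every row. A quick check against the sample readings table (the column alias flag is only for illustration):

```sql
-- Both column references point at the row being evaluated, so
-- "sampled <= sampled" and "sampled > sampled minus 10 minutes" always hold.
SELECT server,
       sampled,
       CASE WHEN sampled <= r1.sampled
             AND sampled > DATEADD(MINUTE, -10, r1.sampled)
            THEN 1 END AS flag   -- 1 on every row, never NULL
FROM   readings r1
ORDER  BY server, sampled;
```

Counting a value that is 1 on every row over ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW just returns the row's position within the partition.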
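I would suggest cross apply() as an alternative to your subquery.
I am not sure whether you meant to limit the results to the previous 9 minutes, but with sampled > dateadd(...) that is what happens in your original subquery: a reading exactly 10 minutes earlier is not counted.
Here is what a window function would look like when the samples are partitioned into 10-minute slices, alongside the cross apply() version.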
select
r.server
, r.sampled
, CrossApply = x.CountRecent
, OriginalSubquery = (
select count(*)
from readings s
where s.server=r.server
and s.sampled <= r.sampled
/* doesn't include 10 minutes ago */
and s.sampled > dateadd(minute,-10,r.sampled)
)
, Slices = count(*) over(
/* partition by server, 10 minute slices, not the same thing*/
partition by server, dateadd(minute,datediff(minute,0,sampled)/10*10,0)
order by sampled
)
from readings r
cross apply (
select CountRecent=count(*)
from readings i
where i.server=r.server
/* changed to >= */
and i.sampled >= dateadd(minute,-10,r.sampled)
and i.sampled <= r.sampled
) as x
order by server,sampled
Results: http://rextester.com/BMMF46402
+--------+---------------------+------------+------------------+--------+
| server | sampled | CrossApply | OriginalSubquery | Slices |
+--------+---------------------+------------+------------------+--------+
| 1 | 01.01.2017 08:00:00 | 1 | 1 | 1 |
| 1 | 01.01.2017 08:02:00 | 2 | 2 | 2 |
| 1 | 01.01.2017 08:05:00 | 3 | 3 | 3 |
| 1 | 01.01.2017 08:30:00 | 1 | 1 | 1 |
| 1 | 01.01.2017 08:31:00 | 2 | 2 | 2 |
| 1 | 01.01.2017 08:37:00 | 3 | 3 | 3 |
| 1 | 01.01.2017 08:40:00 | 4 | 3 | 1 |
| 1 | 01.01.2017 08:41:00 | 4 | 3 | 2 |
| 1 | 01.01.2017 09:07:00 | 1 | 1 | 1 |
| 1 | 01.01.2017 09:08:00 | 2 | 2 | 2 |
| 1 | 01.01.2017 09:09:00 | 3 | 3 | 3 |
| 1 | 01.01.2017 09:11:00 | 4 | 4 | 1 |
+--------+---------------------+------------+------------------+--------+
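The 08:40 and 08:41 rows are where the two predicates disagree, because each has a reading exactly 10 minutes earlier. A small check for 08:40 against the sample data (the variable @t and the column aliases are only for illustration):

```sql
DECLARE @t datetime = '20170101 08:40';

SELECT SUM(CASE WHEN sampled >  DATEADD(MINUTE, -10, @t) THEN 1 ELSE 0 END) AS exclusive_count, -- predicate from the original subquery
       SUM(CASE WHEN sampled >= DATEADD(MINUTE, -10, @t) THEN 1 ELSE 0 END) AS inclusive_count  -- predicate from the cross apply version
FROM   readings
WHERE  server = 1
  AND  sampled <= @t;
```

For the sample data this should give 3 and 4, matching the OriginalSubquery and CrossApply columns for that row.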
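Answer 2: (score: 0)
Thanks Martin and SqlZim for your answers. I am going to raise a Connect enhancement request for something like %%currentrow that could be used in windowed aggregates. I think it would lead to much simpler and more natural SQL:
select count(case when sampled <= %%currentrow.sampled and sampled > dateadd(minute,-10,%%currentrow.sampled) then 1 else null end) over (...whatever the window is...)
We can already use expressions like this:
select count(case when sampled <= getdate() and sampled > dateadd(minute,-10,getdate()) then 1 else null end) over (...whatever the window is...)
It would be great if we could reference columns of the current row in the same way.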