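I'm looking for the best way to run a window function over partitions defined by a datetime value. However, rather than partitioning by the exact time, I want to partition by datetimes that are close together, for example within 15 minutes of each other.
Here is a small sample of my table.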
CREATE TABLE my_table(ID VARCHAR(5), in_time DATETIME)
INSERT INTO my_table (ID, in_time) VALUES
('4844', '2017-04-06 10:15:00.000'),
('5221', '2017-11-24 11:18:00.000'),
('5221', '2017-11-24 11:18:00.000'),
('5221', '2017-11-25 14:23:00.000'),
('8486', '2017-10-10 15:30:00.000'),
('8486', '2017-10-10 15:32:00.000'),
('8486', '2017-10-10 15:46:00.000'), -- new row after updating question
('8486', '2017-10-10 16:00:00.000') -- new row after updating question
Here is the query I'm using now:
SELECT *,
ROW_NUMBER() OVER(PARTITION BY ID, in_time ORDER BY ID, in_time) AS filter_row
FROM my_table
As expected, this gives me:
ID in_time filter_row
4844 2017-04-06 10:15:00.000 1
5221 2017-11-24 11:18:00.000 1
5221 2017-11-24 11:18:00.000 2
5221 2017-11-25 14:23:00.000 1
8486 2017-10-10 15:30:00.000 1
8486 2017-10-10 15:32:00.000 1
8486 2017-10-10 15:46:00.000 1
8486 2017-10-10 16:00:00.000 1
What I want to achieve is:
ID in_time filter_row
4844 2017-04-06 10:15:00.000 1
5221 2017-11-24 11:18:00.000 1
5221 2017-11-24 11:18:00.000 2
5221 2017-11-25 14:23:00.000 1
8486 2017-10-10 15:30:00.000 1
8486 2017-10-10 15:32:00.000 2 -- < notice the 2 here
8486 2017-10-10 15:46:00.000 3 -- < notice the 3 here
8486 2017-10-10 16:00:00.000 4 -- < notice the 4 here
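As shown above, the rows with ID = 8486 should be partitioned together, because the gaps between their consecutive in_time values are only 2 minutes, 14 minutes, and 14 minutes. How can I do this efficiently?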
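Answer 0: (score: 2)
The following example produces the desired result by computing each row's interval start time from the specified interval length (in minutes) and partitioning by that value.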
DECLARE @IntervalMinutes int = 15;
SELECT *,
ROW_NUMBER() OVER(
PARTITION BY ID
, (DATEADD(minute, (DATEDIFF(minute, '', in_time)/@IntervalMinutes)*@IntervalMinutes, '')
)
ORDER BY ID, in_time) AS filter_row
FROM my_table;
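To see what the bucketing expression does, here is a minimal sketch that applies it to the four sample in_time values for ID 8486. The inline VALUES list, the column alias bucket_start, and the explicit base date '19000101' are illustrative assumptions; the base date stands in for the empty string used in the query above, which SQL Server converts to 1900-01-01 for datetime.

```sql
DECLARE @IntervalMinutes int = 15;

-- Whole minutes since the base date, rounded down to a multiple of
-- @IntervalMinutes, then added back to the base date.
SELECT
      CAST(v.in_time AS datetime) AS in_time
    , DATEADD(minute,
          (DATEDIFF(minute, '19000101', CAST(v.in_time AS datetime)) / @IntervalMinutes) * @IntervalMinutes,
          '19000101') AS bucket_start   -- hypothetical alias
FROM (VALUES
      ('2017-10-10 15:30:00.000')
    , ('2017-10-10 15:32:00.000')
    , ('2017-10-10 15:46:00.000')
    , ('2017-10-10 16:00:00.000')
) AS v(in_time);
```

Because the buckets are aligned to the clock, 15:30 and 15:32 share the 15:30 bucket, while 15:46 falls in the 15:45 bucket and 16:00 in the 16:00 bucket, even though each is within 15 minutes of its predecessor.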
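EDIT
The code above computes fixed-length intervals. You can address your updated question by identifying, for each ID, the islands of rows that are separated from one another by at least the desired interval. The following approach uses NOT EXISTS and CROSS APPLY to identify those islands and to determine the interval start and end time of each island.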
DECLARE @IntervalMinutes int = 15;
WITH
start_intervals AS (
SELECT DISTINCT
ID
, in_time
FROM dbo.my_table AS a
WHERE NOT EXISTS(
SELECT 1
FROM dbo.my_table AS b
WHERE
b.ID = a.ID
AND b.in_time < a.in_time
AND b.in_time > DATEADD(minute, -@IntervalMinutes, a.in_time)
)
)
, end_intervals AS (
SELECT
ID
, in_time
FROM dbo.my_table AS a
WHERE NOT EXISTS(
SELECT 1
FROM dbo.my_table AS b
WHERE
b.ID = a.ID
AND b.in_time > a.in_time
AND b.in_time < DATEADD(minute, @IntervalMinutes, a.in_time)
)
)
, intervals AS (
SELECT
ID
, start_intervals.in_time AS start_interval
, end_intervals.in_time AS end_interval
FROM start_intervals
CROSS APPLY(
SELECT TOP(1) in_time
FROM end_intervals
WHERE
end_intervals.ID = start_intervals.ID
AND end_intervals.in_time >= start_intervals.in_time
ORDER BY end_intervals.in_time -- pick the earliest island end at or after this start
) AS end_intervals
)
SELECT
my_table.ID
, my_table.in_time
, ROW_NUMBER() OVER(PARTITION BY my_table.ID, intervals.start_interval ORDER BY my_table.in_time) AS filter_row
FROM dbo.my_table
JOIN intervals ON intervals.ID = my_table.ID -- only match islands that belong to the same ID
AND my_table.in_time BETWEEN intervals.start_interval AND intervals.end_interval;
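As an alternative to the NOT EXISTS / CROSS APPLY query, the same islands can be found with window functions alone. The following is a minimal sketch, not part of the original answer, assuming SQL Server 2012 or later (for LAG and the windowed running SUM). The CTE names flagged and islands and the columns is_island_start and island_no are hypothetical.

```sql
DECLARE @IntervalMinutes int = 15;

WITH flagged AS (
    SELECT
          ID
        , in_time
          -- 0 when the previous row of the same ID is less than
          -- @IntervalMinutes earlier; 1 otherwise (including the first
          -- row, where LAG returns NULL), which starts a new island.
        , CASE
              WHEN LAG(in_time) OVER (PARTITION BY ID ORDER BY in_time)
                   > DATEADD(minute, -@IntervalMinutes, in_time)
              THEN 0
              ELSE 1
          END AS is_island_start
    FROM dbo.my_table
)
, islands AS (
    SELECT
          ID
        , in_time
          -- A running total of the start flags numbers the islands per ID.
        , SUM(is_island_start) OVER (
              PARTITION BY ID
              ORDER BY in_time
              ROWS UNBOUNDED PRECEDING) AS island_no
    FROM flagged
)
SELECT
      ID
    , in_time
    , ROW_NUMBER() OVER (PARTITION BY ID, island_no ORDER BY in_time) AS filter_row
FROM islands;
```

This sketch reads the table once instead of joining it to itself, which may matter on larger tables, though the actual plans should be compared on real data. It uses the same boundary rule as the query above: a gap of exactly @IntervalMinutes starts a new island.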