Question

I have a peak_hours table in which certain time ranges, defined to the hour, are marked as "peak hours":

id | start | end
1  | 05:00 | 09:00
2  | 12:00 | 15:00
3  | 17:00 | 22:00

I have a jobs table that tracks when each job started and completed:

id |    started_at    |   completed_at
1  | 2019-05-07 04:00 | 2019-05-07 16:00

I am trying to get the number of hours a job spent inside peak hours and the number of hours it spent outside them.

Expected output:

peak_hours_total | non_peak_hours_total
7                | 5
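
These figures can be checked by hand: the job runs from 04:00 to 16:00 (12 hours) and overlaps the 05:00-09:00 window for 4 hours and the 12:00-15:00 window for 3 hours, which gives 7 peak hours and 5 non-peak hours. A minimal Python sketch of that arithmetic follows, assuming the job stays within a single calendar day. The function name `split_hours` and the constant `PEAK_WINDOWS` are hypothetical and are not part of the question.

```python
from datetime import datetime, timedelta

# Peak windows copied from the peak_hours table, as (start_hour, end_hour).
PEAK_WINDOWS = [(5, 9), (12, 15), (17, 22)]

def split_hours(started_at, completed_at):
    """Return (peak_hours, non_peak_hours) for a job within a single day."""
    day = started_at.replace(hour=0, minute=0, second=0, microsecond=0)
    peak = timedelta(0)
    for start_hour, end_hour in PEAK_WINDOWS:
        window_start = day + timedelta(hours=start_hour)
        window_end = day + timedelta(hours=end_hour)
        # Overlap between the job's span and this peak window.
        overlap = min(completed_at, window_end) - max(started_at, window_start)
        if overlap > timedelta(0):
            peak += overlap
    total = completed_at - started_at
    return peak.total_seconds() / 3600, (total - peak).total_seconds() / 3600

peak_hours, non_peak_hours = split_hours(
    datetime(2019, 5, 7, 4, 0), datetime(2019, 5, 7, 16, 0)
)
print(peak_hours, non_peak_hours)
```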

Answer 1

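As Harry mentioned in the comments, one approach is to expand the single row holding a date range into multiple rows, each representing one value at the required granularity (hour, minute, etc.). The reason for doing this is that SQL Server is not very efficient when working with ranges, and the transactional data may span multiple days.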

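The following example expands the data to minute-level granularity and produces the required result. Keep in mind that I did not spend time optimizing the code, so there is definitely room for improvement: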

-- Input
;with PeakHours as (
    select 1 as id, '05:00' as [start], '09:00' as [end]
    union all
    select 2 as id, '12:00' as [start], '15:00' as [end]
    union all
    select 3 as id, '17:00' as [start], '22:00' as [end]
)
, data as (
    select 1 as id, '2019-05-07 04:00' as started_at, '2019-05-07 16:00' as completed_at
)
-- Convert start and end to UNIX to be able to get ranges
, data2 as (
    select *
        ,DATEDIFF(s, '1970-01-01', started_at) as started_at_unix
        ,DATEDIFF(s, '1970-01-01', completed_at) as completed_at_unix
    from data
)
-- Find min start and max end to cover whole possible range
, data3 as (
    select min(started_at_unix) as min_started_at_unix, max(completed_at_unix) as max_completed_at_unix
    from data2
)
-- expand data using Tally table technique
,lv0 AS (SELECT 0 g UNION ALL SELECT 0)
,lv1 AS (SELECT 0 g FROM lv0 a CROSS JOIN lv0 b) 
,lv2 AS (SELECT 0 g FROM lv1 a CROSS JOIN lv1 b) 
,lv3 AS (SELECT 0 g FROM lv2 a CROSS JOIN lv2 b) 
,lv4 AS (SELECT 0 g FROM lv3 a CROSS JOIN lv3 b) 
,lv5 AS (SELECT 0 g FROM lv4 a CROSS JOIN lv4 b) 
,Tally (n) AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM lv5)
, data_expanded as (
    SELECT TOP (select (max_completed_at_unix - min_started_at_unix) / 60 from data3) (n - 1) * 60 + d3.min_started_at_unix as unix_timestamp_min
    from Tally as t
    cross apply data3 as d3
)
-- Aggregate
select
     1.0 * sum(case when ph.id is not null then 1 else 0 end) / 60 as peak_hours_total 
    ,1.0 * sum(case when ph.id is null then 1 else 0 end) / 60 as non_peak_hours_total
from data_expanded as de
inner join data2 as d2
    on de.unix_timestamp_min between d2.started_at_unix and d2.completed_at_unix
left join PeakHours as ph
    on cast(dateadd(s, de.unix_timestamp_min, '1970-01-01') as time(0)) between ph.[start] and dateadd(SECOND, -1, cast(ph.[end] as time(0)))
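
The minute-level expansion used in the query above can be cross-checked outside SQL Server. The following is a rough Python sketch of the same idea, not a translation of the T-SQL: it walks the job's span one minute at a time and classifies each minute by its time of day, so it also covers jobs that run past midnight. The names `peak_split_by_minute` and `PEAK_WINDOWS` are hypothetical, and the peak windows are treated as half-open (start included, end excluded), which is what the query achieves by subtracting one second from `[end]`.

```python
from datetime import datetime, time, timedelta

# Peak windows copied from the peak_hours table; the end time is exclusive.
PEAK_WINDOWS = [
    (time(5, 0), time(9, 0)),
    (time(12, 0), time(15, 0)),
    (time(17, 0), time(22, 0)),
]

def peak_split_by_minute(started_at, completed_at):
    """Return (peak_hours, non_peak_hours) by classifying every minute."""
    peak_minutes = 0
    non_peak_minutes = 0
    current = started_at
    while current < completed_at:
        time_of_day = current.time()
        if any(start <= time_of_day < end for start, end in PEAK_WINDOWS):
            peak_minutes += 1
        else:
            non_peak_minutes += 1
        current += timedelta(minutes=1)
    return peak_minutes / 60, non_peak_minutes / 60

# The job from the question.
print(peak_split_by_minute(datetime(2019, 5, 7, 4, 0), datetime(2019, 5, 7, 16, 0)))
```

For the job in the question this yields 7 peak hours and 5 non-peak hours, matching the expected output. Like the query, it trades speed for simplicity: the work grows with the number of minutes covered rather than with the number of peak windows.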
