Given a table of appointments like this:
User Start End
UserA 2016-01-15 12:00:00 2016-01-15 14:00:00
UserA 2016-01-15 15:00:00 2016-01-15 17:00:00
UserB 2016-01-15 13:00:00 2016-01-15 15:00:00
UserB 2016-01-15 13:32:00 2016-01-15 15:00:00
UserB 2016-01-15 15:30:00 2016-01-15 15:30:00
UserB 2016-01-15 15:45:00 2016-01-15 16:00:00
UserB 2016-01-15 17:30:00 2016-01-15 18:00:00
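I want to create a list of distinct time intervals, each with the number of people who have an appointment during that interval: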
Start End Count
2016-01-15 12:00:00 2016-01-15 13:00:00 1
2016-01-15 13:00:00 2016-01-15 14:00:00 2
2016-01-15 14:00:00 2016-01-15 15:45:00 1
2016-01-15 15:45:00 2016-01-15 16:00:00 2
2016-01-15 16:00:00 2016-01-15 17:00:00 1
2016-01-15 17:00:00 2016-01-15 17:30:00 0
2016-01-15 17:30:00 2016-01-15 18:00:00 1
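How can I do this in SQL, preferably SQL Server 2008?
Edit: To clarify: by hand, I get the result by creating one row per user, marking the blocked time, and then counting the rows that have a mark: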
Time  12  13  14  15  16  17
UserA xxxxxxxx    xxxxxxxx
UserB     xxxxxxxx   x      xx
Count 1   2   1      21   0 1
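The result set should start at the earliest available time and end at the latest available time. The ASCII art only has a resolution of 15 minutes; I need at least minute resolution. I guess you can leave out the rows with a "0" result if that is easier for you.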
Answer 0 (score: 4)
There must be a simpler way than this, but at least you can follow each step individually:
declare @t table ([User] varchar(19) not null,Start datetime2 not null,[End] datetime2 not null)
insert into @t([User], Start, [End]) values
('UserA','2016-01-15T12:00:00','2016-01-15T14:00:00'),
('UserA','2016-01-15T15:00:00','2016-01-15T17:00:00'),
('UserB','2016-01-15T13:00:00','2016-01-15T15:00:00'),
('UserB','2016-01-15T13:32:00','2016-01-15T15:00:00'),
('UserB','2016-01-15T15:30:00','2016-01-15T15:30:00'),
('UserB','2016-01-15T15:45:00','2016-01-15T16:00:00'),
('UserB','2016-01-15T17:30:00','2016-01-15T18:00:00')
;With Times as (
    -- every start or end stamp is a potential period boundary
    select Start as Point from @t
    union
    select [End] from @t
), Ordered as (
    -- number the boundaries in chronological order
    select Point,ROW_NUMBER() OVER (ORDER BY Point) as rn
    from Times
), Periods as (
    -- pair each boundary with the next one: the smallest possible periods
    select
        o1.Point as Start,
        o2.Point as [End]
    from
        Ordered o1
            inner join
        Ordered o2
            on
                o1.rn = o2.rn - 1
), UserCounts as (
    -- count the distinct users whose appointments overlap each period
    select p.Start,p.[End],COUNT(distinct [User]) as Cnt,ROW_NUMBER() OVER (Order BY p.[Start]) as rn
    from
        Periods p
            left join
        @t t
            on
                p.Start < t.[End] and
                t.Start < p.[End]
    group by
        p.Start,p.[End]
), Consolidated as (
    -- anchor: periods whose predecessor has a different count
    select uc.*
    from
        UserCounts uc
            left join
        UserCounts uc_anti
            on
                uc.rn = uc_anti.rn + 1 and
                uc.Cnt = uc_anti.Cnt
    where
        uc_anti.Cnt is null
    union all
    -- recursion: extend a period with the adjacent one that has the same count
    select c.Start,uc.[End],c.Cnt,uc.rn
    from
        Consolidated c
            inner join
        UserCounts uc
            on
                c.Cnt = uc.Cnt and
                c.[End] = uc.Start
)
select
    Start,MAX([End]) as [End],Cnt
from
    Consolidated
group by
    Start,Cnt
order by Start
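The CTEs are:
Times - because any given start or end stamp can start or end a period in the final result, we simply collect them all in one column, so that Ordered can number them and Periods can reassemble them into the smallest possible periods.
UserCounts goes back to the original data and works out how many users overlap each computed period.
Consolidated is the trickiest CTE to follow, but it basically merges periods that abut each other and have the same user count.
Result: the seven intervals from the desired output in the question, with the count in the Cnt column.
(I even get the zero row, which I wasn't sure I would be able to get in there)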
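The join condition in UserCounts (p.Start < t.[End] and t.Start < p.[End]) is what decides whether an appointment counts towards a period. As a minimal sketch, the same test can be tried in isolation against one period. The variables @pStart and @pEnd and the inline row list are hypothetical and only stand in for one row of Periods and a few rows of @t:

```sql
-- Hypothetical period to test against; in the query above this role
-- is played by a single row of the Periods CTE.
declare @pStart datetime2 = '2016-01-15T15:00:00';
declare @pEnd   datetime2 = '2016-01-15T15:30:00';

select
    a.[User],
    a.Start,
    a.[End],
    -- same strict comparison as the join condition in UserCounts
    case when @pStart < a.[End] and a.Start < @pEnd then 1 else 0 end as Overlaps
from (values
    -- appointment that covers the whole period
    ('UserA', cast('2016-01-15T15:00:00' as datetime2), cast('2016-01-15T17:00:00' as datetime2)),
    -- appointment that ends exactly where the period starts
    ('UserB', cast('2016-01-15T13:00:00' as datetime2), cast('2016-01-15T15:00:00' as datetime2)),
    -- zero-length appointment at the end of the period
    ('UserB', cast('2016-01-15T15:30:00' as datetime2), cast('2016-01-15T15:30:00' as datetime2))
) as a([User], Start, [End]);
```

With these values only the UserA row should get Overlaps = 1. Because both comparisons are strict, an appointment that merely touches the period boundary and a zero-length appointment are not counted, which is consistent with the 14:00 to 15:45 interval having a count of 1 in the desired output.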
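Answer 1 (score: 0)
This kind of query is easier to write if you have a calendar table. In this example, though, I built one on the fly with a recursive CTE. The CTE returns appointment blocks, which we can then join to the appointment data. I couldn't work out the interval pattern in your sample data, so I show the results in one-hour blocks. You can modify this part, or define your own blocks in a second table.
Sample data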
/* Table variables make sharing data easier
*/
DECLARE @Sample TABLE
    (
        [User] VARCHAR(50),
        [Start] DATETIME,
        [End] DATETIME
    )
;
INSERT INTO @Sample
    (
        [User],
        [Start],
        [End]
    )
VALUES
    ('UserA', '2016-01-15 12:00:00', '2016-01-15 14:00:00'),
    ('UserA', '2016-01-15 15:00:00', '2016-01-15 17:00:00'),
    ('UserB', '2016-01-15 13:00:00', '2016-01-15 15:00:00'),
    ('UserB', '2016-01-15 13:32:00', '2016-01-15 15:00:00'),
    ('UserB', '2016-01-15 15:30:00', '2016-01-15 15:30:00'),
    ('UserB', '2016-01-15 15:45:00', '2016-01-15 16:00:00'),
    ('UserB', '2016-01-15 17:30:00', '2016-01-15 18:00:00')
;
I used two variables to limit the returned results to appointments that fall within a given start and end point.
/* Set a start and end point for the next query
*/
DECLARE @Start DATETIME = '2016-01-15 12:00:00';
DECLARE @End DATETIME = '2016-01-15 18:00:00';
WITH Calendar AS
    (
        /* Anchor returns start of first appointment
        */
        SELECT
            @Start AS [Start],
            DATEADD(SECOND, -1, DATEADD(HOUR, 1, @Start)) AS [End]
        UNION ALL
        /* Recursion, keep adding new records until end of last appointment
        */
        SELECT
            DATEADD(HOUR, 1, [Start]) AS [Start],
            DATEADD(HOUR, 1, [End]) AS [End]
        FROM
            Calendar
        WHERE
            [End] <= @End
    )
SELECT
    c.[Start],
    c.[End],
    COUNT(DISTINCT s.[User]) AS [Count]
FROM
    Calendar AS c
        LEFT OUTER JOIN @Sample AS s ON s.[Start] BETWEEN c.[Start] AND c.[End]
                                     OR s.[End] BETWEEN c.[Start] AND c.[End]
GROUP BY
    c.[Start],
    c.[End]
;
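Because an appointment can last longer than one hour, it may contribute to more than one row. This explains why the 7 sample rows lead to a returned total of 9.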
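One caveat about the join above: it matches an appointment only when its start or its end falls inside a block, so an appointment that spans a whole block is not counted for that block. For example, UserA's 12:00 to 14:00 appointment is not counted in the 13:00 block. A possible variation, sketched here under the assumption that the blocks are half-open one-hour intervals (start included, end excluded), uses a plain overlap test instead. This is not the answer's original query, and the sample data is repeated only so the sketch runs on its own:

```sql
/* Sample data repeated so that the sketch is self-contained */
DECLARE @Sample TABLE ([User] VARCHAR(50), [Start] DATETIME, [End] DATETIME);

INSERT INTO @Sample ([User], [Start], [End])
VALUES
    ('UserA', '2016-01-15T12:00:00', '2016-01-15T14:00:00'),
    ('UserA', '2016-01-15T15:00:00', '2016-01-15T17:00:00'),
    ('UserB', '2016-01-15T13:00:00', '2016-01-15T15:00:00'),
    ('UserB', '2016-01-15T13:32:00', '2016-01-15T15:00:00'),
    ('UserB', '2016-01-15T15:30:00', '2016-01-15T15:30:00'),
    ('UserB', '2016-01-15T15:45:00', '2016-01-15T16:00:00'),
    ('UserB', '2016-01-15T17:30:00', '2016-01-15T18:00:00');

DECLARE @Start DATETIME = '2016-01-15T12:00:00';
DECLARE @End   DATETIME = '2016-01-15T18:00:00';

WITH Calendar AS
    (
        /* Assumption: half-open hourly blocks, [Start] included, [End] excluded */
        SELECT
            @Start AS [Start],
            DATEADD(HOUR, 1, @Start) AS [End]
        UNION ALL
        /* The next block starts where the previous one ended */
        SELECT
            [End],
            DATEADD(HOUR, 1, [End])
        FROM
            Calendar
        WHERE
            [End] < @End
    )
SELECT
    c.[Start],
    c.[End],
    COUNT(DISTINCT s.[User]) AS [Count]
FROM
    Calendar AS c
        /* Overlap test: the appointment starts before the block ends
           and ends after the block starts */
        LEFT OUTER JOIN @Sample AS s ON s.[Start] < c.[End]
                                     AND c.[Start] < s.[End]
GROUP BY
    c.[Start],
    c.[End]
ORDER BY
    c.[Start]
;
```

With the sample data this should return six hourly blocks from 12:00 to 18:00 with counts 1, 2, 1, 2, 1, 1. The result is still limited to one-hour resolution, so it does not reproduce the minute-level intervals asked for in the question.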