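I have a set of data for users starting and stopping a program. I need to determine the total time each instance ran. However, if the program was stopped and started again on the same day, I need to treat it as one continuous run.
The end result should be: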
User  Start       End         EventId
--------------------------------------
X     1/1/2016    1/1/2016    1
X     1/1/2016    1/5/2016    1
X     1/5/2016    1/10/2016   1
X     1/10/2016   1/13/2016   1
X     12/20/2016  12/26/2016  2
Y     01/01/2016  01/01/2016  3
Y     01/01/2016  01/02/2016  3
Y     01/04/2016  01/10/2016  4
Or:
User  EventId  DurationDays
------------------------------
X     1        13
X     2        6
Y     3        2
Y     4        6
But I think that if someone can help me group the rows correctly, I can easily work out the rest myself.
The table below shows what I have so far:
User  Start       End         LagStart    LagStop
-------------------------------------------------
X     1/1/2016    1/1/2016    Startgroup
X     1/1/2016    1/5/2016    Follow
X     1/5/2016    1/10/2016   Follow
X     1/10/2016   1/13/2016   Follow      StopGroup
X     12/20/2016  12/26/2016  StartGroup  StopGroup
X     12/26/2016  12/30/2016  Startgroup  StopGroup
Y     01/01/2016  01/01/2016  StartGroup
Y     01/01/2016  01/02/2016  StartGroup  StopGroup
Y     01/04/2016  01/10/2016  StartGroup  StopGroup
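What I am stuck on is creating a new unique ID that begins at each "StartGroup" and ends at each "StopGroup".
In case it helps to see how this data set was built, here is the query: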
select
    a.user_start_key as firstStartKey,
    a.user_end_key as firstEndKey,
    a.start_dt as firstStartDate,
    a.end_dt as firstDisch,
    a.rnkkey as firstRank,
    nextRec.user_start_key as nextStart,
    nextRec.start_dt,
    nextRec.max_rank,
    case
        when Lag(nextRec.max_rank, 1) over (order by a.rnkkey) is null
            then 'StartGroup'
        when Lag(nextRec.max_rank, 1) over (order by a.rnkkey) in (a.rnkkey)
            then 'Follow'
        else 'Start'
    end as LagStart,
    case
        when lead(a.rnkkey, 1) over (order by a.rnkkey) is null
            then 'StopGroup'
        when lead(a.rnkkey, 1) over (order by a.rnkkey) <> nextRec.max_rank
            then 'StopGroup'
        else Null
    end as Lagstop
from
    #rnk1 a
inner join
    (Select Distinct
        user_start_key,
        start_dt,
        --dschrg_dt,
        max(rnkkey) over (partition by user_start_key order by end_dt desc) max_rank
    from
        #rnk1) nextRec on a.user_end_key = nextRec.user_start_key
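The user_[state]_key fields are just unique keys I built for each user_id by date, because there are multiple users and I need to group each of them separately.
Let me know if anything needs further clarification. Thanks to anyone who can help.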
Answer 0 (score: 0)
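Here is an example that uses a cumulative sum to calculate a rank. Each row is flagged with 1 when its start date differs from the previous end date of the same user (or when there is no previous row), and with 0 otherwise. The running total of those flags stays constant across a run of back-to-back rows and increases at the start of each new run.
That rank can then be used for grouping.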
-- Using a table variable for easy testing
declare @T table (
    id int identity(1,1) primary key,
    [User] varchar(8),
    startdate date,
    enddate date
);

-- Sample data
insert into @T ([User], startdate, enddate) values
 ('X','2018-01-01','2018-01-01')
,('X','2018-01-01','2018-01-05')
,('X','2018-01-05','2018-01-10')
,('X','2018-01-10','2018-01-13')
,('X','2018-12-20','2018-12-26')
,('Y','2018-01-01','2018-01-01')
,('Y','2018-01-01','2018-01-02')
,('Y','2018-01-04','2018-01-10')
;

select
    [User],
    cumm_sum_rank as EventId,
    -- +1 so that both the first and the last day of the group are counted
    datediff(day, min(startdate), max(enddate)) + 1 as DurationDays,
    min(startdate) as [Start],
    max(enddate) as [End]
from
(
    -- Running total of the flags: it goes up by 1 at every row that starts a new group
    select *,
        sum(startdate_diff_prev_enddate) over (order by [User], startdate, enddate) as cumm_sum_rank
    from
    (
        -- Flag is 0 when the row starts on the previous row's end date for the same user, else 1
        select [User], startdate, enddate,
            iif(startdate = lag(enddate) over (partition by [User] order by startdate, enddate), 0, 1) as startdate_diff_prev_enddate
        from @T
    ) as q1
) as q2
group by [User], cumm_sum_rank;
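If the row-level result from the question is wanted instead (every original row labelled with its EventId), the same flag and running total can be selected without the GROUP BY. The following is only a minimal sketch, assuming SQL Server 2012 or later (for LAG and IIF). The table variable @Sample, the CTE name flagged, and the column is_new_group are names made up for this illustration, and the sample rows repeat the ones above so that the block can be run as its own batch. The explicit "rows unbounded preceding" frame is an addition that does not change the result for this data, since no two rows share the same user, start date, and end date.

```sql
-- Hypothetical table variable holding the same sample rows as the answer above
declare @Sample table ([User] varchar(8), startdate date, enddate date);

insert into @Sample ([User], startdate, enddate) values
 ('X','2018-01-01','2018-01-01')
,('X','2018-01-01','2018-01-05')
,('X','2018-01-05','2018-01-10')
,('X','2018-01-10','2018-01-13')
,('X','2018-12-20','2018-12-26')
,('Y','2018-01-01','2018-01-01')
,('Y','2018-01-01','2018-01-02')
,('Y','2018-01-04','2018-01-10')
;

with flagged as
(
    -- 1 marks the first row of a group: there is no previous row for the user,
    -- or the previous end date differs from this start date
    select [User], startdate, enddate,
        iif(startdate = lag(enddate) over (partition by [User] order by startdate, enddate), 0, 1) as is_new_group
    from @Sample
)
select
    [User],
    startdate as [Start],
    enddate as [End],
    -- Running total of the flags, used as the group identifier
    sum(is_new_group) over (order by [User], startdate, enddate rows unbounded preceding) as EventId
from flagged
order by [User], startdate, enddate;

-- Result for the sample rows:
-- User  Start       End         EventId
-- X     2018-01-01  2018-01-01  1
-- X     2018-01-01  2018-01-05  1
-- X     2018-01-05  2018-01-10  1
-- X     2018-01-10  2018-01-13  1
-- X     2018-12-20  2018-12-26  2
-- Y     2018-01-01  2018-01-01  3
-- Y     2018-01-01  2018-01-02  3
-- Y     2018-01-04  2018-01-10  4
```

Because the running total is ordered across all users rather than partitioned by user, the EventId values are unique over the whole result, which matches the numbering in the question's expected output.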