Q: How can I rank records based on changes in the value of one column?
I have the following data (https://pastebin.com/vdTb1JRT):
EmployeeID Date Onleave
ABH12345 2016-01-01 0
ABH12345 2016-01-02 0
ABH12345 2016-01-03 0
ABH12345 2016-01-04 0
ABH12345 2016-01-05 0
ABH12345 2016-01-06 0
ABH12345 2016-01-07 0
ABH12345 2016-01-08 0
ABH12345 2016-01-09 0
ABH12345 2016-01-10 1
ABH12345 2016-01-11 1
ABH12345 2016-01-12 1
ABH12345 2016-01-13 1
ABH12345 2016-01-14 0
ABH12345 2016-01-15 0
ABH12345 2016-01-16 0
ABH12345 2016-01-17 0
I would like to produce the following result:
EmployeeID DateValidFrom DateValidTo OnLeave
ABH12345 2016-01-01 2016-01-09 0
ABH12345 2016-01-10 2016-01-13 1
ABH12345 2016-01-14 2016-01-17 0
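So I was wondering whether I could somehow create a ranking column (as shown below) that increments each time the value in the Onleave column changes, partitioned by the EmployeeID column.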
EmployeeID Date Onleave RankedCol
ABH12345 2016-01-01 0 1
ABH12345 2016-01-02 0 1
ABH12345 2016-01-03 0 1
ABH12345 2016-01-04 0 1
ABH12345 2016-01-05 0 1
ABH12345 2016-01-06 0 1
ABH12345 2016-01-07 0 1
ABH12345 2016-01-08 0 1
ABH12345 2016-01-09 0 1
ABH12345 2016-01-10 1 2
ABH12345 2016-01-11 1 2
ABH12345 2016-01-12 1 2
ABH12345 2016-01-13 1 2
ABH12345 2016-01-14 0 3
ABH12345 2016-01-15 0 3
ABH12345 2016-01-16 0 3
ABH12345 2016-01-17 0 3
Then I could do the following:
SELECT
[EmployeeID] = [EmployeeID]
,[DateValidFrom] = MIN([Date])
,[DateValidTo] = MAX([Date])
,[OnLeave] = [OnLeave]
FROM table/view/cte/sub-query
GROUP BY
[EmployeeID]
,[OnLeave]
,[RankedCol]
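To make the intended RankedCol concrete, here is a minimal plain-Python sketch of the same idea, not SQL and not taken from any answer. The helper names `add_ranked_col` and `collapse` are hypothetical, and the rows are rebuilt from the sample data above. It numbers the rows per employee, bumps the counter whenever Onleave changes, and then takes the min and max date per (EmployeeID, OnLeave, RankedCol) as the GROUP BY would.

```python
from datetime import date, timedelta

# Sample rows from the question: (EmployeeID, Date, Onleave)
flags = [0] * 9 + [1] * 4 + [0] * 4
rows = [("ABH12345", date(2016, 1, 1) + timedelta(days=i), f)
        for i, f in enumerate(flags)]


def add_ranked_col(rows):
    """Append a counter that restarts at 1 per employee and
    increments each time Onleave differs from the previous row."""
    out = []
    prev_emp = prev_flag = None
    rank = 0
    for emp, day, flag in sorted(rows):  # order by EmployeeID, then Date
        if emp != prev_emp:
            rank = 1
        elif flag != prev_flag:
            rank += 1
        out.append((emp, day, flag, rank))
        prev_emp, prev_flag = emp, flag
    return out


def collapse(ranked):
    """MIN(Date) / MAX(Date) per (EmployeeID, OnLeave, RankedCol)."""
    spans = {}
    for emp, day, flag, rank in ranked:
        key = (emp, flag, rank)
        lo, hi = spans.get(key, (day, day))
        spans[key] = (min(lo, day), max(hi, day))
    return sorted((emp, lo, hi, flag)
                  for (emp, flag, _rank), (lo, hi) in spans.items())


ranked = add_ranked_col(rows)
periods = collapse(ranked)
for p in periods:
    print(p)
```

The three tuples printed correspond to the three desired output rows.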
Other solutions are very welcome.
Here is the test data:
WITH CTE AS ( SELECT EmployeeID = 'ABH12345', [Date] = CAST(N'2016-01-01' AS Date), [Onleave] = 0
UNION SELECT 'ABH12345', CAST(N'2016-01-02' AS Date), 0
UNION SELECT 'ABH12345', CAST(N'2016-01-03' AS Date), 0
UNION SELECT 'ABH12345', CAST(N'2016-01-04' AS Date), 0
UNION SELECT 'ABH12345', CAST(N'2016-01-05' AS Date), 0
UNION SELECT 'ABH12345', CAST(N'2016-01-06' AS Date), 0
UNION SELECT 'ABH12345', CAST(N'2016-01-07' AS Date), 0
UNION SELECT 'ABH12345', CAST(N'2016-01-08' AS Date), 0
UNION SELECT 'ABH12345', CAST(N'2016-01-09' AS Date), 0
UNION SELECT 'ABH12345', CAST(N'2016-01-10' AS Date), 1
UNION SELECT 'ABH12345', CAST(N'2016-01-11' AS Date), 1
UNION SELECT 'ABH12345', CAST(N'2016-01-12' AS Date), 1
UNION SELECT 'ABH12345', CAST(N'2016-01-13' AS Date), 1
UNION SELECT 'ABH12345', CAST(N'2016-01-14' AS Date), 0
UNION SELECT 'ABH12345', CAST(N'2016-01-15' AS Date), 0
UNION SELECT 'ABH12345', CAST(N'2016-01-16' AS Date), 0
UNION SELECT 'ABH12345', CAST(N'2016-01-17' AS Date), 0
)
SELECT * FROM CTE
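Answer 0: (score: 3)
Another way to do this, using lag.
Fetch the previous Onleave value for each employeeid, flag each row where it differs from the current value, and take a running sum of those flags so that every change starts a new group.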
select employeeid,min(date) as date_from,max(date) as date_to,max(onleave) as onleave
from (select t.*,sum(case when prev_ol=onleave then 0 else 1 end) over(partition by employeeid order by date) as grp
from (select c.*,lag(onleave,1,onleave) over(partition by employeeid order by date) as prev_ol
from cte c
) t
) t
group by employeeid,grp
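The two window steps in that query can be traced by hand. The sketch below is a plain-Python imitation, assuming the sample rows from the question; `lag_groups` is a hypothetical helper name, not part of the answer. It mirrors `lag(onleave, 1, onleave)` by shifting the values one row down and reusing the first value as its own predecessor, then mirrors the windowed `sum(case ...)` with a running total.

```python
from datetime import date, timedelta
from itertools import accumulate, groupby

# Sample rows from the question: (EmployeeID, Date, Onleave)
flags = [0] * 9 + [1] * 4 + [0] * 4
rows = [("ABH12345", date(2016, 1, 1) + timedelta(days=i), f)
        for i, f in enumerate(flags)]


def lag_groups(rows):
    """Return rows extended with a grp number that grows by one
    each time Onleave changes within an employee's date order."""
    out = []
    for _emp, part in groupby(sorted(rows), key=lambda r: r[0]):
        part = list(part)
        vals = [r[2] for r in part]
        # lag(onleave, 1, onleave): previous value, defaulting to itself
        prev = [vals[0]] + vals[:-1]
        # case when prev_ol = onleave then 0 else 1 end
        changed = [0 if p == v else 1 for p, v in zip(prev, vals)]
        # sum(...) over (partition by employeeid order by date)
        grp = list(accumulate(changed))
        out.extend((e, d, f, g) for (e, d, f), g in zip(part, grp))
    return out


grouped = lag_groups(rows)
print([g for *_, g in grouped])
```

Because the first row is compared with itself, numbering starts at 0 rather than 1; only the distinctness of the group numbers matters for the final GROUP BY.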
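Answer 1: (score: 2)
This is an example of a gaps-and-islands problem. In this case, you can use date arithmetic. The key observation is that subtracting a sequence of integers from the date column yields a constant date for every run of consecutive dates that share the same value, and that constant identifies the island.
As a query, this looks like: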
SELECT EmployeeId, MIN([Date]) as DateValidFrom, MAX([Date]) as DateValidTo,
OnLeave
FROM (SELECT t.*,
ROW_NUMBER() OVER (PARTITION BY EmployeeId, OnLeave ORDER BY [Date]) as seqnum
FROM t
) t
GROUP BY EmployeeID, DATEADD(day, - seqnum, [Date]), OnLeave;
You can run the subquery, look at its results, and then do the arithmetic yourself to see how it works.
Here is an example.
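The arithmetic can also be checked outside the database. The following is a minimal Python sketch, assuming the question's sample rows and one row per calendar day; `island_keys` is a hypothetical name. It counts rows per (EmployeeID, OnLeave) in date order, as the ROW_NUMBER() does, and subtracts that count in days from each date.

```python
from collections import defaultdict
from datetime import date, timedelta

# Sample rows from the question: (EmployeeID, Date, Onleave)
flags = [0] * 9 + [1] * 4 + [0] * 4
rows = [("ABH12345", date(2016, 1, 1) + timedelta(days=i), f)
        for i, f in enumerate(flags)]


def island_keys(rows):
    """Return rows extended with Date minus seqnum days, where seqnum
    counts rows per (EmployeeID, OnLeave) in date order."""
    seq = defaultdict(int)
    out = []
    for emp, day, flag in sorted(rows):  # dates ascend within each partition
        seq[(emp, flag)] += 1
        out.append((emp, day, flag, day - timedelta(days=seq[(emp, flag)])))
    return out


keyed = island_keys(rows)
# Distinct (OnLeave, anchor date) pairs in order of first appearance
distinct = list(dict.fromkeys((flag, anchor) for _, _, flag, anchor in keyed))
print(distinct)
```

Within a run of consecutive dates both the date and the sequence number advance by one per row, so the difference stays fixed. After a gap the date jumps ahead while the sequence number does not, which gives a new anchor date. This relies on the dates being contiguous within each island.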
Answer 2: (score: 2)
Here is another, simpler way to get the desired output; it reads the table only once.
-- sample of data from your question
with t1(EmployeeID, Date1, Onleave) as(
select 'ABH12345', cast('2016-01-01' as date), 0 union all
select 'ABH12345', cast('2016-01-02' as date), 0 union all
select 'ABH12345', cast('2016-01-03' as date), 0 union all
select 'ABH12345', cast('2016-01-04' as date), 0 union all
select 'ABH12345', cast('2016-01-05' as date), 0 union all
select 'ABH12345', cast('2016-01-06' as date), 0 union all
select 'ABH12345', cast('2016-01-07' as date), 0 union all
select 'ABH12345', cast('2016-01-08' as date), 0 union all
select 'ABH12345', cast('2016-01-09' as date), 0 union all
select 'ABH12345', cast('2016-01-10' as date), 1 union all
select 'ABH12345', cast('2016-01-11' as date), 1 union all
select 'ABH12345', cast('2016-01-12' as date), 1 union all
select 'ABH12345', cast('2016-01-13' as date), 1 union all
select 'ABH12345', cast('2016-01-14' as date), 0 union all
select 'ABH12345', cast('2016-01-15' as date), 0 union all
select 'ABH12345', cast('2016-01-16' as date), 0 union all
select 'ABH12345', cast('2016-01-17' as date), 0
)
-- actual query
select max(w.employeeid) as employeeid
, min(w.date1) as datevalidfrom
, max(w.date1) as datevalidto
, max(w.onleave) as onleave
from (
select row_number() over(partition by employeeid order by date1) -
row_number() over(partition by employeeid, onleave order by date1) as grp
, employeeid
, date1
, onleave
from t1 s
) w
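-- grp is only unique within one (employeeid, onleave) pair, so group by all three
group by w.employeeid, w.onleave, w.grp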
order by employeeid, datevalidfrom
Result:
employeeid datevalidfrom datevalidto onleave
---------- ------------- ----------- -----------
ABH12345 2016-01-01 2016-01-09 0
ABH12345 2016-01-10 2016-01-13 1
ABH12345 2016-01-14 2016-01-17 0
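To see what the difference of the two row numbers produces, here is a small Python sketch under the same assumptions as the sample data; `rn_diff` is a hypothetical helper, not part of the answer. It also runs a made-up alternating sequence to show that a grp value is only meaningful together with employeeid and onleave, which is why the sketch keeps both alongside grp.

```python
from collections import defaultdict
from datetime import date, timedelta


def make_rows(flags, emp="ABH12345"):
    """Build (EmployeeID, Date, Onleave) rows, one per day from 2016-01-01."""
    return [(emp, date(2016, 1, 1) + timedelta(days=i), f)
            for i, f in enumerate(flags)]


def rn_diff(rows):
    """Return rows extended with
    row_number() over (partition by employeeid order by date)
    - row_number() over (partition by employeeid, onleave order by date)."""
    rn_all = defaultdict(int)
    rn_flag = defaultdict(int)
    out = []
    for emp, day, flag in sorted(rows):
        rn_all[emp] += 1
        rn_flag[(emp, flag)] += 1
        out.append((emp, day, flag, rn_all[emp] - rn_flag[(emp, flag)]))
    return out


# Sample data from the question
sample = rn_diff(make_rows([0] * 9 + [1] * 4 + [0] * 4))
print([g for *_, g in sample])

# Hypothetical alternating data: two different islands share grp = 1
alternating = rn_diff(make_rows([0, 1, 0, 1]))
print([(f, g) for _, _, f, g in alternating])
```

On the sample data the three islands get three different grp values. On the alternating data the second row (onleave 1) and the third row (onleave 0) both get grp 1, so they are only told apart by onleave.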