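I have a question very similar to the one asked here: Merge duplicate temporal records in database
The difference is that I need the end date to be an actual date rather than NULL.
So, given the following data: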
EmployeeId  StartDate   EndDate     Column1  Column2
1000        2009/05/01  2010/04/30  X        Y
1000        2010/05/01  2011/04/30  X        Y
1000        2011/05/01  2012/04/30  X        X
1000        2012/05/01  2013/04/30  X        Y
1000        2013/05/01  2014/04/30  X        X
1000        2014/05/01  2014/06/01  X        X
The desired result is:
EmployeeId  StartDate   EndDate     Column1  Column2
1000        2009/05/01  2011/04/30  X        Y
1000        2011/05/01  2012/04/30  X        X
1000        2012/05/01  2013/04/30  X        Y
1000        2013/05/01  2014/06/01  X        X
The solution proposed in the linked thread is:
with t1 as -- tag first row with 1 in a continuous time series
(
    select t1.*, case when t1.column1=t2.column1 and t1.column2=t2.column2
                      then 0 else 1 end as tag
    from test_table t1
    left join test_table t2
        on t1.EmployeeId= t2.EmployeeId and dateadd(day,-1,t1.StartDate)= t2.EndDate
)
select t1.EmployeeId, t1.StartDate,
       case when min(T2.StartDate) is null then null
            else dateadd(day,-1,min(T2.StartDate)) end as EndDate,
       t1.Column1, t1.Column2
from (select t1.* from t1 where tag=1 ) as t1 -- to get StartDate
left join (select t1.* from t1 where tag=1 ) as t2 -- to get a new EndDate
    on t1.EmployeeId= t2.EmployeeId and t1.StartDate < t2.StartDate
group by t1.EmployeeId, t1.StartDate, t1.Column1, t1.Column2;
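However, this does not seem to work when you need an actual end date instead of NULL: the last range has no following range, so min(T2.StartDate) is null and its EndDate comes back as NULL.
Can anyone help me solve this?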
Answer 0 (score: 0)
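Here is another solution (taken from How do I group on continuous ranges). It is simpler to code and also caters for NULL values (i.e. it treats NULL = NULL as a match, unlike a simple LAG() comparison). However, due to the GROUP BY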
SELECT EmployeeId
, MIN(StartDate) AS StartDate
, MAX(EndDate) AS EndDate
, Column1
, Column2
FROM
(
SELECT t.*
, ROW_NUMBER() OVER(PARTITION BY EmployeeId, Column1, Column2 ORDER BY StartDate ) AS GRN
, ROW_NUMBER() OVER(PARTITION BY EmployeeId ORDER BY StartDate ) AS RN
FROM
test_table t
) t
GROUP BY
EmployeeId
, Column1
, Column2
, RN - GRN
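
To see why grouping on RN - GRN works, the query above can be tried against the sample rows from the question. The following is only a sketch under some assumptions: it uses Python's built-in sqlite3 module with an in-memory database as a stand-in for the real one (window functions need SQLite 3.25 or later), stores the dates as ISO text so that they sort correctly, and adds an ORDER BY so the output order is predictable. The helper name run and the TRACE_SQL query are made up for this illustration.

```python
import sqlite3

# Sample rows from the question, with dates rewritten as ISO text (assumption:
# text dates are good enough here because ISO strings sort chronologically).
ROWS = [
    (1000, "2009-05-01", "2010-04-30", "X", "Y"),
    (1000, "2010-05-01", "2011-04-30", "X", "Y"),
    (1000, "2011-05-01", "2012-04-30", "X", "X"),
    (1000, "2012-05-01", "2013-04-30", "X", "Y"),
    (1000, "2013-05-01", "2014-04-30", "X", "X"),
    (1000, "2014-05-01", "2014-06-01", "X", "X"),
]

# Hypothetical helper query: exposes RN, GRN and their difference per row.
TRACE_SQL = """
SELECT StartDate, Column2, RN, GRN, RN - GRN AS Grp
FROM (
    SELECT t.*
         , ROW_NUMBER() OVER(PARTITION BY EmployeeId, Column1, Column2 ORDER BY StartDate) AS GRN
         , ROW_NUMBER() OVER(PARTITION BY EmployeeId ORDER BY StartDate) AS RN
    FROM test_table t
) t
ORDER BY 1
"""

# The query from this answer, plus an ORDER BY for a stable output order.
MERGE_SQL = """
SELECT EmployeeId
     , MIN(StartDate) AS StartDate
     , MAX(EndDate) AS EndDate
     , Column1
     , Column2
FROM (
    SELECT t.*
         , ROW_NUMBER() OVER(PARTITION BY EmployeeId, Column1, Column2 ORDER BY StartDate) AS GRN
         , ROW_NUMBER() OVER(PARTITION BY EmployeeId ORDER BY StartDate) AS RN
    FROM test_table t
) t
GROUP BY EmployeeId, Column1, Column2, RN - GRN
ORDER BY 1, 2
"""


def run(sql, rows=ROWS):
    """Load rows into a throwaway in-memory table and return the query result."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE test_table (EmployeeId INTEGER, StartDate TEXT,"
        " EndDate TEXT, Column1 TEXT, Column2 TEXT)"
    )
    conn.executemany("INSERT INTO test_table VALUES (?, ?, ?, ?, ?)", rows)
    result = conn.execute(sql).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in run(TRACE_SQL):
        print(row)   # one line per input row: StartDate, Column2, RN, GRN, RN - GRN
    for row in run(MERGE_SQL):
        print(row)   # one line per merged range
```

Within a run of consecutive rows that share the same Column1 and Column2, RN and GRN both increase by one per row, so their difference stays constant; as soon as a row with different values sits in between, RN moves ahead of GRN and the difference changes. One thing to keep in mind: the grouping only looks at row order, not at the dates themselves, so two rows with equal values are merged even if there is a gap between one row's EndDate and the next row's StartDate. If gaps can occur in the real data, an extra contiguity check would probably be needed.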