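I need to create date ranges from the dates in a monthly file, with the results grouped by EmpID and StatusCode.
The "Start" date will be the first day of the earliest month, and the "End" date will be the last day of the last month in the grouping. Thanks!
Table: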
EmpID StatusCode AsOF
12345 J 6/30/2014
12345 J 7/31/2014
12345 J 8/29/2014
12345 J 9/30/2014
12345 G 10/31/2014
12345 G 11/28/2014
12345 G 12/31/2014
12345 G 1/30/2015
12345 G 2/27/2015
12345 M 3/30/2015
12345 M 4/30/2015
12345 M 5/29/2015
12345 M 6/30/2015
12345 G 7/31/2015
12345 G 8/31/2015
12345 G 9/30/2015
12345 G 10/30/2015
12345 G 11/30/2015
Expected result:
EmpID StatusCode Start End
12345 J 6/1/2014 9/30/2014
12345 G 10/1/2014 2/28/2015
12345 M 3/1/2015 6/30/2015
12345 G 7/1/2015 11/30/2015
Answer 0 (score: 2)
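The group by logic depends not only on EmpID and StatusCode (as stated in the question) but also on what is known as gaps-and-islands. For example, the expected output has 2 records for EmpID 12345 and StatusCode G because there are 2 islands (2014/10 - 2015/2 and 2015/7 - 2015/11) with a gap between them (2015/3 - 2015/6).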
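To make the problem concrete, here is a minimal sketch of what a plain group by on EmpID and StatusCode returns. The inline rows are a hypothetical subset of the question's data (not the full sample), chosen so that one M row sits between two runs of G rows.

```sql
-- Hypothetical inline subset of the question's rows: G, G, M, G, G.
with d (EmpID, StatusCode, AsOf) as (
    select v.EmpID, v.StatusCode, cast(v.AsOf as date)
    from (values
        (12345, 'G', '20150130'),
        (12345, 'G', '20150227'),
        (12345, 'M', '20150330'),
        (12345, 'G', '20150731'),
        (12345, 'G', '20150831')
    ) as v (EmpID, StatusCode, AsOf)
)
-- Grouping only by EmpID and StatusCode merges both G runs into one row.
select d.EmpID
     , d.StatusCode
     , min(d.AsOf) as FirstAsOf
     , max(d.AsOf) as LastAsOf
from d
group by d.EmpID
       , d.StatusCode;
-- G comes back as a single row spanning 2015-01-30 to 2015-08-31,
-- which hides the gap where the status was M.
```

An extra grouping key that changes at every gap is what separates the two G ranges.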
Sample data:
I used @RazvanSocol's very nice sample data to save typing, but I include it here in case his answer is modified later.
CREATE TABLE #sample_Data (
EmpID int NOT NULL,
StatusCode CHAR(1) NOT NULL,
AsOf DATE NOT NULL,
UNIQUE (EmpID, AsOf)
)
INSERT INTO #sample_Data (EmpID, StatusCode, AsOf) VALUES
(12345,'J','20140630'),
(12345,'J','20140731'),
(12345,'J','20140829'),
(12345,'J','20140930'),
(12345,'G','20141031'),
(12345,'G','20141128'),
(12345,'G','20141231'),
(12345,'G','20150130'),
(12345,'G','20150227'),
(12345,'M','20150330'),
(12345,'M','20150430'),
(12345,'M','20150529'),
(12345,'M','20150630'),
(12345,'G','20150731'),
(12345,'G','20150831'),
(12345,'G','20150930'),
(12345,'G','20151030'),
(12345,'G','20151130')
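Answer:
This answer works assuming you are on SQL Server 2012 or later. The query below uses window functions such as lag and sum to determine where each island starts and to assign it an IslandNbr. In the final outer query, a datediff calculation finds the first day of the month of the earliest AsOf date in the island, and the eomonth function finds the last day of the month of the latest AsOf date.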
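select b.EmpID
     , b.StatusCode
     -- first day of the month of the earliest AsOf in the island
     , cast(dateadd(month, datediff(month, 0, min(b.AsOf)), 0) as date) as [Start]
     -- last day of the month of the latest AsOf in the island
     , eomonth(max(b.AsOf)) as [End]
from (
    -- the running total of island starts numbers the islands per EmpID/StatusCode
    select a.EmpID
         , a.StatusCode
         , a.AsOf
         , sum(a.IslandBegin) over (partition by a.EmpID, a.StatusCode order by a.AsOf) as IslandNbr
    from (
        -- a row begins an island unless the previous row for the same
        -- EmpID/StatusCode falls in the immediately preceding month
        select d.EmpID
             , d.StatusCode
             , d.AsOf
             , case when datediff(month, lag(d.AsOf, 1, null) over (partition by d.EmpID, d.StatusCode order by d.AsOf asc), d.AsOf) = 1 then 0 else 1 end as IslandBegin
        from #sample_Data as d
    ) as a
) as b
group by b.EmpID
       , b.StatusCode
       , b.IslandNbr
order by [Start]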
Output:
The output matches the expected result exactly.
+-------+------------+------------+------------+
| EmpID | StatusCode | Start | End |
+-------+------------+------------+------------+
| 12345 | J | 2014-06-01 | 2014-09-30 |
| 12345 | G | 2014-10-01 | 2015-02-28 |
| 12345 | M | 2015-03-01 | 2015-06-30 |
| 12345 | G | 2015-07-01 | 2015-11-30 |
+-------+------------+------------+------------+
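To see how the query arrives at these rows, the following sketch exposes the intermediate columns per input row instead of aggregating them. It assumes SQL Server 2012 or later, as the answer does, and runs on a hypothetical inline subset of the data rather than on #sample_Data; the MonthStart and MonthEnd column names are made up for illustration.

```sql
-- Hypothetical inline subset of the question's rows: G, G, M, G, G.
with d (EmpID, StatusCode, AsOf) as (
    select v.EmpID, v.StatusCode, cast(v.AsOf as date)
    from (values
        (12345, 'G', '20150130'),
        (12345, 'G', '20150227'),
        (12345, 'M', '20150330'),
        (12345, 'G', '20150731'),
        (12345, 'G', '20150831')
    ) as v (EmpID, StatusCode, AsOf)
), a as (
    -- 1 when the row starts a new island, 0 when it continues the previous month
    select d.EmpID
         , d.StatusCode
         , d.AsOf
         , case when datediff(month, lag(d.AsOf) over (partition by d.EmpID, d.StatusCode order by d.AsOf), d.AsOf) = 1
                then 0 else 1 end as IslandBegin
    from d
)
select a.EmpID
     , a.StatusCode
     , a.AsOf
     , a.IslandBegin
     -- running total of island starts = island number within EmpID/StatusCode
     , sum(a.IslandBegin) over (partition by a.EmpID, a.StatusCode order by a.AsOf) as IslandNbr
     -- per-row month boundaries (illustrative names, not part of the answer's query)
     , cast(dateadd(month, datediff(month, 0, a.AsOf), 0) as date) as MonthStart
     , eomonth(a.AsOf) as MonthEnd
from a
order by a.AsOf;
-- The January and February G rows get IslandNbr 1; the July and August G rows
-- get IslandNbr 2 because July is five months after February, not one.
```

Grouping by EmpID, StatusCode and IslandNbr then collapses each island to one row, and taking min and max of AsOf before applying the month-boundary expressions gives the Start and End values.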
Answer 1 (score: 1)
You can use the following:
/*
CREATE TABLE Table1 (
EmpID int NOT NULL,
StatusCode CHAR(1) NOT NULL,
AsOf DATE NOT NULL,
UNIQUE (EmpID, AsOf)
)
INSERT INTO dbo.Table1 (EmpID, StatusCode, AsOf) VALUES
(12345,'J','20140630'),
(12345,'J','20140731'),
(12345,'J','20140829'),
(12345,'J','20140930'),
(12345,'G','20141031'),
(12345,'G','20141128'),
(12345,'G','20141231'),
(12345,'G','20150130'),
(12345,'G','20150227'),
(12345,'M','20150330'),
(12345,'M','20150430'),
(12345,'M','20150529'),
(12345,'M','20150630'),
(12345,'G','20150731'),
(12345,'G','20150831'),
(12345,'G','20150930'),
(12345,'G','20151030'),
(12345,'G','20151130')
*/
SELECT DISTINCT Q2.EmpID, Q2.StatusCode,
    -- first day of the month of the island's earliest AsOf
    DATEADD(MONTH, DATEDIFF(MONTH, '20000101', Q2.FirstAsOf), '20000101') AS StartDate,
    -- first day of the following month, minus one day
    DATEADD(DAY, -1, DATEADD(MONTH, DATEDIFF(MONTH, '20000101', Q2.LastAsOf) + 1, '20000101')) AS EndDate
FROM (
    SELECT *,
        MIN(Q1.AsOf) OVER (PARTITION BY Q1.EmpID, Q1.StatusCode, Q1.Dif) AS FirstAsOf,
        MAX(Q1.AsOf) OVER (PARTITION BY Q1.EmpID, Q1.StatusCode, Q1.Dif) AS LastAsOf
    FROM (
        -- Dif stays constant within each run of consecutive rows that share a StatusCode
        SELECT *, ROW_NUMBER() OVER (PARTITION BY t.EmpID ORDER BY t.AsOf)
                - ROW_NUMBER() OVER (PARTITION BY t.EmpID, t.StatusCode ORDER BY t.AsOf) AS Dif
        FROM dbo.Table1 t
    ) Q1
) Q2
ORDER BY EmpID, StartDate
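The query above gives no explanation of the Dif column, so here is a small sketch of the row-number difference on its own. It uses a hypothetical inline subset of the data instead of dbo.Table1, and the RnAll and RnStatus column names are made up for illustration.

```sql
-- Hypothetical inline subset of the question's rows: G, G, M, G, G.
with t (EmpID, StatusCode, AsOf) as (
    select v.EmpID, v.StatusCode, cast(v.AsOf as date)
    from (values
        (12345, 'G', '20150130'),
        (12345, 'G', '20150227'),
        (12345, 'M', '20150330'),
        (12345, 'G', '20150731'),
        (12345, 'G', '20150831')
    ) as v (EmpID, StatusCode, AsOf)
)
select t.EmpID
     , t.StatusCode
     , t.AsOf
     -- position among all rows of the employee
     , row_number() over (partition by t.EmpID order by t.AsOf) as RnAll
     -- position among the rows of the employee with the same status
     , row_number() over (partition by t.EmpID, t.StatusCode order by t.AsOf) as RnStatus
     , row_number() over (partition by t.EmpID order by t.AsOf)
     - row_number() over (partition by t.EmpID, t.StatusCode order by t.AsOf) as Dif
from t
order by t.AsOf;
-- Dif is 0 for the January and February G rows, 2 for the M row,
-- and 1 for the July and August G rows, so the two G runs land in
-- different (EmpID, StatusCode, Dif) partitions.
```

Both counters advance together while the status stays the same, so Dif only changes when a row with a different status intervenes. This means the technique splits on a change of status between adjacent rows, not on missing calendar months; a skipped month with no status change would stay in the same island.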