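I have a dataset containing a time series in YYYYMM format, plus two columns that basically act as true/false flags. Based on these flags, I want to add two extra columns that give the range each row currently falls in: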
Period Default Cure
201301 0 NULL
201302 0 NULL
201303 0 NULL
201304 1 NULL
201305 1 NULL
201306 1 NULL
201307 1 NULL
201308 NULL 0
201309 NULL 0
201310 NULL 1
201311 0 NULL
201312 0 NULL
201401 0 NULL
201402 0 NULL
201403 1 NULL
201404 1 NULL
201405 0 NULL
201406 0 NULL
201407 NULL 1
201408 NULL 0
201409 NULL 0
201410 NULL 0
201411 NULL 0
201412 NULL 0
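In this dataset you can see that the Default column is set to 1 in periods 201304, 201305, 201306 and 201307, and that the Cure column is set to 1 in period 201310.
This basically means that a default range is in effect from period 201304 through period 201310. Ultimately I want to produce the following set: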
Period Default Cure DefaultPeriod CurePeriod
201301 0 NULL NULL NULL
201302 0 NULL NULL NULL
201303 0 NULL NULL NULL
201304 1 NULL 201304 201310
201305 1 NULL 201304 201310
201306 1 NULL 201304 201310
201307 1 NULL 201304 201310
201308 NULL 0 201304 201310
201309 NULL 0 201304 201310
201310 NULL 1 201304 201310
201311 0 NULL NULL NULL
201312 0 NULL NULL NULL
201401 0 NULL NULL NULL
201402 0 NULL NULL NULL
201403 1 NULL 201403 201407
201404 1 NULL 201403 201407
201405 0 NULL 201403 201407
201406 0 NULL 201403 201407
201407 NULL 1 201403 201407
201408 NULL 0 NULL NULL
201409 NULL 0 NULL NULL
201410 NULL 0 NULL NULL
201411 NULL 0 NULL NULL
201412 NULL 0 NULL NULL
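Multiple ranges can occur, but they cannot overlap. How would I achieve this? I have tried various min/max period self-joins on the same table, but I can't seem to find a working solution.
Answer 0 (score: 1)
This one really took some thinking :)
Basically, I partition the data on the "cure" dates (c1), number each group (c2), then find the min and max within each group (c3, c4): the earliest month with Default = 1 and the latest month with Cure = 1. Finally, I apply some logic to blank out the rows that come before the min.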
declare @t table
(
[Month] varchar(6),
[Default] bit,
[Cure] bit
);
insert into @t values('201301', 0, NULL);
insert into @t values('201302', 0, NULL);
insert into @t values('201303', 0, NULL);
insert into @t values('201304', 1, NULL);
insert into @t values('201305', 1, NULL);
insert into @t values('201306', 1, NULL);
insert into @t values('201307', 1, NULL);
insert into @t values('201308', NULL, 0);
insert into @t values('201309', NULL, 0);
insert into @t values('201310', NULL, 1);
insert into @t values('201311', 0, NULL);
insert into @t values('201312', 0, NULL);
insert into @t values('201401', 0, NULL);
insert into @t values('201402', 0, NULL);
insert into @t values('201403', 1, NULL);
insert into @t values('201404', 1, NULL);
insert into @t values('201405', 0, NULL);
insert into @t values('201406', 0, NULL);
insert into @t values('201407', NULL, 1);
insert into @t values('201408', NULL, 0);
insert into @t values('201409', NULL, 0);
insert into @t values('201410', NULL, 0);
insert into @t values('201411', NULL, 0);
insert into @t values('201412', NULL, 0);
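-- c1: the first month plus every month where Cure = 1; each one marks a group boundary.
with c1 as
(
select min([Month]) [Month], 1 x from @t
union all
select [Month],1 from @t
where Cure = 1
),
-- c2: count the boundaries on earlier rows only, so a cure month stays in the group it closes.
c2 as
(
select t.[Month],[Default],[Cure],
sum(x) over (order by t.[Month] rows between unbounded preceding and 1 preceding) grp
from @t t
left outer join c1 on c1.[Month] = t.[Month]
),
-- c3: the first month with Default = 1 in each group.
c3 as
(
select grp, min([Month]) [Month]
from c2
where [Default] = 1
group by grp
),
-- c4: the last month with Cure = 1 in each group.
c4 as
(
select grp, max([Month]) [Month]
from c2
where [Cure] = 1
group by grp
)
-- Rows that come before their group's first default month get NULL for both periods.
select c2.[Month], c2.[Default], c2.[Cure],
case when c2.[Month] >= c3.[Month] then c3.[Month] else null end as DefaultPeriod,
case when c2.[Month] >= c3.[Month] then c4.[Month] else null end as CurePeriod
from c2
left outer join c3 on c2.grp = c3.grp
left outer join c4 on c2.grp = c4.grp
order by c2.[Month]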
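
To see why the partitioning works, it can help to look at the intermediate grp value on its own. The following is only a minimal sketch. It assumes SQL Server 2012 or later (for the ROWS frame in the window function) and uses a cut-down copy of the sample data in a hypothetical table variable @s. It also folds the c1 join into a CASE expression, which is a reformulation for illustration rather than the query above.

```sql
-- Hypothetical cut-down sample: one default/cure cycle plus the start of the next.
declare @s table
(
    [Month] varchar(6),
    [Default] bit,
    [Cure] bit
);
insert into @s values
    ('201303', 0, NULL),
    ('201304', 1, NULL),
    ('201308', NULL, 0),
    ('201310', NULL, 1),
    ('201311', 0, NULL),
    ('201403', 1, NULL);

-- Flag the first month and every cure month with 1, everything else with 0,
-- then sum the flags over earlier rows only (the frame ends at "1 preceding").
select s.[Month], s.[Default], s.[Cure],
    sum(case when s.[Cure] = 1 or s.[Month] = f.FirstMonth then 1 else 0 end)
        over (order by s.[Month] rows between unbounded preceding and 1 preceding) as grp
from @s s
cross join (select min([Month]) as FirstMonth from @s) f
order by s.[Month];
```

On this sample grp should come out as NULL, 1, 1, 1, 2, 2. The first row has no preceding rows, so its sum is NULL. The cure month 201310 still belongs to group 1 because its own flag is excluded from the frame, and the next group only starts at 201311.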