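My dataset is shown below. In this dataset, start_location is missing wherever activity_type is No. What I want is to create a column that holds, for each gap, the minimum activity_date of that gap (a gap being a run of records without a start_location). For example, if 5/07/2015, 6/07/2015 and 7/07/2015 have a null start_location, then the new column, new_activity_date, must be 05/07/2015, the first date of the gap.
In short, I need to find the first date of each gap period. The data is as follows: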
id     activity_type  activity_date  start_location  end_location
27151  Yes            4/07/2015      18              18
27151  No             5/07/2015
27151  No             6/07/2015
27151  Yes            7/07/2015      18              17
27151  Yes            8/07/2015      18              17
27151  Yes            9/07/2015      18              17
27151  Others         19/07/2015     17              17
27151  Others         20/07/2015     17              17
27151  No             21/07/2015
27151  No             22/07/2015
27151  No             23/07/2015
27151  Yes            24/07/2015     17              17
27151  Yes            25/07/2015     17              17
27151  Yes            26/07/2015     17              17
27151  Yes            27/07/2015     17              17
My data should end up looking like this:
id     activity_type  activity_date  start_location  end_location  new_activity_date
27151  Yes            4/07/2015      18              18            4/07/2015
27151  No             5/07/2015                                    4/07/2015
27151  No             6/07/2015                                    4/07/2015
27151  Yes            7/07/2015      18              17            7/07/2015
27151  Yes            8/07/2015      18              17            8/07/2015
27151  Yes            9/07/2015      18              17            9/07/2015
27151  Others         19/07/2015     17              17            19/07/2015
27151  Others         20/07/2015     17              17            20/07/2015
27151  No             21/07/2015                                   20/07/2015
27151  No             22/07/2015                                   20/07/2015
27151  No             23/07/2015                                   20/07/2015
27151  Yes            24/07/2015     17              17            24/07/2015
27151  Yes            25/07/2015     17              17            25/07/2015
27151  Yes            26/07/2015     17              17            26/07/2015
27151  Yes            27/07/2015     17              17            27/07/2015
Thanks in advance. I'm not sure where I'm going wrong with this.
Answer 0 (score: 0)
I think I know what you want. I think a function works best here; you can then use it in an insert or update:
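create function dbo.get_New_Activity_Date(@activity_date datetime)
returns datetime
as
begin
    declare @result datetime

    -- First date of the gap containing @activity_date: the earliest row that
    -- comes after the latest row (on or before @activity_date) with a start_location.
    -- Note: there is no filter on id here, so this assumes myTable holds a single id.
    set @result =
        (select min(activity_date)
         from myTable
         where activity_date <= @activity_date
           and activity_date >
               (select max(t2.activity_date)
                from myTable t2
                where t2.start_location is not null -- if gaps are stored as blanks rather than nulls, test for blank instead
                  and t2.activity_date <= @activity_date
               )
        )

    -- A row that has a start_location is not in a gap, so it keeps its own date.
    return isnull(@result, @activity_date)
end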
If you just want a select statement that pulls from this table as part of an ETL job, without using a function:
select
    id
    ,activity_type
    ,activity_date
    ,start_location
    ,end_location
    -- first date of the gap for gap rows; the row's own date otherwise
    ,isnull((select min(t1.activity_date)
             from myTable t1
             where t1.id = myTable.id
               and t1.activity_date <= myTable.activity_date
               and t1.activity_date >
                   (select max(t2.activity_date)
                    from myTable t2
                    where t2.id = myTable.id
                      and t2.start_location is not null -- if gaps are stored as blanks rather than nulls, test for blank instead
                      and t2.activity_date <= myTable.activity_date
                   )
            ), activity_date) as new_activity_date
from
    myTable
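One quick way to see what the correlated-subquery version returns is to run the same logic against an in-memory SQLite table from Python. This is only a sketch under a few assumptions: isnull is swapped for COALESCE because SQLite has no two-argument isnull, the outer table gets the alias o, and the sample rows, the ISO date strings, and the helper name first_gap_dates are made up for illustration.

```python
import sqlite3

# Assumed sample rows mirroring part of the question's data. Dates are ISO
# text so that string comparison in SQLite matches chronological order.
ROWS = [
    (27151, "Yes", "2015-07-04", 18, 18),
    (27151, "No", "2015-07-05", None, None),
    (27151, "No", "2015-07-06", None, None),
    (27151, "Yes", "2015-07-07", 18, 17),
    (27151, "Others", "2015-07-20", 17, 17),
    (27151, "No", "2015-07-21", None, None),
    (27151, "No", "2015-07-22", None, None),
    (27151, "Yes", "2015-07-24", 17, 17),
]

QUERY = """
SELECT o.activity_date,
       COALESCE(
           (SELECT MIN(t1.activity_date)
              FROM myTable t1
             WHERE t1.id = o.id
               AND t1.activity_date <= o.activity_date
               AND t1.activity_date >
                   (SELECT MAX(t2.activity_date)
                      FROM myTable t2
                     WHERE t2.id = o.id
                       AND t2.start_location IS NOT NULL
                       AND t2.activity_date <= o.activity_date)),
           o.activity_date) AS new_activity_date
  FROM myTable o
 ORDER BY o.activity_date
"""


def first_gap_dates(rows):
    """Return (activity_date, new_activity_date) pairs for the given rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE myTable (id INTEGER, activity_type TEXT, "
        "activity_date TEXT, start_location INTEGER, end_location INTEGER)"
    )
    conn.executemany("INSERT INTO myTable VALUES (?, ?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


gap_dates = first_gap_dates(ROWS)
for activity_date, new_activity_date in gap_dates:
    print(activity_date, new_activity_date)
```

With these rows, every gap row gets the first date of its own gap (2015-07-05 for both 2015-07-05 and 2015-07-06), which is what the question's prose describes. The expected-output table in the question instead shows the last date before the gap, so it is worth checking which of the two you actually need.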
Answer 1 (score: 0)
Hmm. This looks to me like a conditional cumulative max:
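select t.*,
       -- running max of the non-'No' dates within each id, ordered by date;
       -- assumes activity_date is a date/datetime column so max() is chronological
       max(case when activity_type <> 'No' then activity_date end)
           over (partition by id order by activity_date) as new_activity_date
from t;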
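To sanity-check the running-max idea outside SQL Server, here is a minimal sketch using Python's built-in sqlite3 module (window functions need SQLite 3.25 or newer). The table name t follows the query above, but the sample rows, the ISO date strings, and the helper name fill_gap_dates are assumptions made for illustration, not part of the original answer.

```python
import sqlite3

# Assumed sample rows mirroring part of the question's data. Dates are ISO
# text so that MAX() and ORDER BY on strings follow chronological order.
ROWS = [
    (27151, "Yes", "2015-07-04", 18, 18),
    (27151, "No", "2015-07-05", None, None),
    (27151, "No", "2015-07-06", None, None),
    (27151, "Yes", "2015-07-07", 18, 17),
    (27151, "Others", "2015-07-20", 17, 17),
    (27151, "No", "2015-07-21", None, None),
    (27151, "No", "2015-07-22", None, None),
    (27151, "Yes", "2015-07-24", 17, 17),
]

QUERY = """
SELECT activity_date,
       MAX(CASE WHEN activity_type <> 'No' THEN activity_date END)
           OVER (PARTITION BY id ORDER BY activity_date) AS new_activity_date
  FROM t
 ORDER BY activity_date
"""


def fill_gap_dates(rows):
    """Return (activity_date, new_activity_date) pairs via a running max."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE t (id INTEGER, activity_type TEXT, "
        "activity_date TEXT, start_location INTEGER, end_location INTEGER)"
    )
    conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


filled = fill_gap_dates(ROWS)
for activity_date, new_activity_date in filled:
    print(activity_date, new_activity_date)
```

On these rows the No records pick up the most recent preceding non-No date (2015-07-05 and 2015-07-06 both map to 2015-07-04), which lines up with the expected-output table in the question.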