I want to 'clean' a dataset and declare a new variable, then fill in a date based on rank.
My dataset looks like this:
+-----+--------------+------------+-------+
| ID  | Start_date   | End_date   | Rank  |
+-----+--------------+------------+-------+
| a   | May '16      | May '16    | 5     |
| a   | Jun '16      | Jul '16    | 4     |
| a   | Jul '16      | Aug '16    | 3     |
| a   | Aug '16      | NULL       | 2     |
| a   | Sept '16     | NULL       | 1     |
+-----+--------------+------------+-------+
I basically want to put the start date of rank 1 into the end date of rank 2, or the start date of rank 5 into the end date of rank 6 (the end date always comes from the row whose rank is one lower).
I have written the following to select into a temp table and rank the rows by id and date:
SELECT
     [Start_Date] as 'start'
    ,[End_Date] as 'end'
    ,[Code] as 'code'
    ,[ID] as 'id'
    ,rank() over (partition by [id] order by [Start_Date]) as 'rank'
INTO #1
FROM [Table]
ORDER BY [id]
The following part does not work...
Answer 0 (score: 0)
Assuming you already have the dataset as shown in the question, this is just a simple self join, isn't it?
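-- Sample data modelled on the question, held in a table variable.
declare @t table(ID nvarchar(1), Start_date date, End_date date, [Rank] int);
insert into @t values ('a','20170501','20170501',5),('a','20170601','20170701',4),('a','20170701','20170801',3),('a','20170801',NULL,2),('a','20170901',NULL,1);

select t1.ID
      ,t1.Start_date
      ,isnull(t1.End_date,t2.Start_date) as End_date
      -- If you *always* want to overwrite the End_Date use this instead:
      -- ,t2.Start_date as End_date
      ,t1.[Rank]
from @t t1
    left join @t t2
        on(t1.ID = t2.ID                 -- only match rows that belong to the same ID
           and t1.[Rank] = t2.[Rank]+1)  -- t2 is the row ranked one lower, i.e. the next start date
order by t1.ID, t1.[Rank] desc;          -- row order is not guaranteed without this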
Output:
+----+------------+------------+------+
| ID | Start_date | End_date | Rank |
+----+------------+------------+------+
| a | 2017-05-01 | 2017-05-01 | 5 |
| a | 2017-06-01 | 2017-07-01 | 4 |
| a | 2017-07-01 | 2017-08-01 | 3 |
| a | 2017-08-01 | 2017-09-01 | 2 |
| a | 2017-09-01 | NULL | 1 |
+----+------------+------------+------+
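The same gap-filling can probably also be written without a self join. The following is only a sketch, assuming SQL Server 2012 or later (where the `lead()` window function is available) and assuming ranks are unique within each ID; with tied ranks the row picked by `lead()` would be arbitrary. The second ID `'b'` is made-up sample data, added only to show that rows from different IDs are kept apart.

```sql
-- Sample data: ID 'a' is copied from the answer above, ID 'b' is hypothetical.
declare @t table(ID nvarchar(1), Start_date date, End_date date, [Rank] int);
insert into @t values
     ('a','20170501','20170501',5)
    ,('a','20170601','20170701',4)
    ,('a','20170701','20170801',3)
    ,('a','20170801',NULL,2)
    ,('a','20170901',NULL,1)
    ,('b','20170301',NULL,2)
    ,('b','20170401',NULL,1);

select ID
      ,Start_date
      -- Within each ID, walk from the highest rank down; lead() then returns the
      -- Start_date of the row ranked one lower, which fills a missing End_date.
      ,isnull(End_date
             ,lead(Start_date) over (partition by ID order by [Rank] desc)) as End_date
      ,[Rank]
from @t
order by ID, [Rank] desc;
```

For ID `a` this should return the same rows as the self join. The row with rank 1 in each ID keeps a NULL end date, because no lower-ranked row exists to supply one.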