Deriving the current row's EndDate from the next row's StartDate

Time: 2012-11-07 12:13:13

Tags: tsql

Can someone help me work out how to create an EndDate from a StartDate?

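Products are referred to a company for testing. While a product is with the company, tests are carried out on different dates, and each TestDate is recorded to establish the product's condition (the OutcomeID). I need to establish a StartDate, which is the TestDate, and an EndDate, which is the StartDate of the next row. However, if several consecutive tests resulted in the same OutcomeID, I need to return a single row with the StartDate of the first test and the EndDate of the last test; in other words, one row for as long as the OutcomeID did not change across consecutive tests. Here is my data set: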


DECLARE @ProductTests TABLE
(
  RequestID int not null,
  ProductID int not null,
  TestID    int not null,
  TestDate  datetime null,
  OutcomeID int
)

insert into @ProductTests (RequestID, ProductID, TestID, TestDate, OutcomeID)
select 1,2,22,'2005-01-21',10 union all
select 1,2,42,'2007-03-17',10 union all
select 1,2,45,'2010-12-25',10 union all
select 1,2,325,'2011-01-14',13 union all
select 1,2,895,'2011-08-10',15 union all
select 1,2,111,'2011-12-23',15 union all
select 1,2,636,'2012-05-02',10 union all
select 1,2,554,'2012-11-08',17

--select * from @producttests


RequestID   ProductID   TestID    TestDate        OutcomeID
1               2           22    2005-01-21         10
1               2           42    2007-03-17         10
1               2           45    2010-12-25         10
1               2           325   2011-01-14         13
1               2           895   2011-08-10         15
1               2           111   2011-12-23         15
1               2           636   2012-05-02         10
1               2           554   2012-11-08         17

This is what I need to achieve:


RequestID ProductID  StartDate        EndDate           OutcomeID
1            2       2005-01-21       2011-01-14        10
1            2       2011-01-14       2011-08-10        13
1            2       2011-08-10       2012-05-02        15
1            2       2012-05-02       2012-11-08        10
1            2       2012-11-08       NULL              17

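As you can see from the data set, the first three tests (22, 42 and 45) all resulted in OutcomeID 10, so in my result I only need the start date of test 22 and the end date of test 45, which is the start date of test 325. As you can see at test 636, the OutcomeID went from 15 back to 10, so that period needs to be returned as well.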

This is what I have managed to achieve so far, using the following script:

select T1.RequestID,T1.ProductID,T1.TestDate AS StartDate
       ,MIN(T2.TestDate) AS EndDate ,T1.OutcomeID 
from   @producttests T1
left join @ProductTests T2 ON T1.RequestID=T2.RequestID 
and T1.ProductID=T2.ProductID and T2.TestDate>T1.TestDate

group by T1.RequestID,T1.ProductID ,T1.OutcomeID,T1.TestDate

order by T1.TestDate

Result:


RequestID   ProductID   StartDate   EndDate       OutcomeID
1                  2    2005-01-21  2007-03-17         10
1                  2    2007-03-17  2010-12-25         10
1                  2    2010-12-25  2011-01-14         10
1                  2    2011-01-14  2011-08-10         13
1                  2    2011-08-10  2011-12-23         15
1                  2    2011-12-23  2012-05-02         15
1                  2    2012-05-02  2012-11-08         10
1                  2    2012-11-08  NULL               17

2 Answers:

Answer 0 (score: 0)

It is Nov 7 and there is still no answer, so here is my solution. It is not very pretty, but it works.

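My tip is to read up on windowing, ranking and aggregate functions such as ROW_NUMBER, RANK, AVG, SUM and so on. They are essential when you want to write reports, and they become very powerful in SQL Server 2012.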

I have also used a CTE (common table expression), but it could be written as a subquery or with a temp table instead.

-- Note: this reads from a temp table #ProductTests, not the @ProductTests table variable used in the question.
;with cte (ida, requestid, productid, testid, testdate, outcomeid) as
(
    -- keep only the rows where the outcome id changes (plus the first row of each product)
    select b.*
    from (select ROW_NUMBER() over (partition by requestid, productid order by testDate) as id, *
          from #ProductTests) a
    right outer join
         (select ROW_NUMBER() over (partition by requestid, productid order by testDate) as id, *
          from #ProductTests) b
      on a.requestID = b.requestID and a.productID = b.productID and a.id + 1 = b.id
    where a.outcomeid <> b.outcomeid
       or b.outcomeid is null
       or a.id is null   -- no previous row: first test of the product
)
-- each remaining row starts a period; the date of the next remaining row ends it
select a.RequestID, a.ProductID, a.TestDate AS StartDate, MIN(b.TestDate) AS EndDate, a.OutcomeID
from cte a
left join cte b
  on a.requestid = b.requestid and a.productid = b.productid and a.testdate < b.testdate
group by a.RequestID, a.ProductID, a.OutcomeID, a.TestDate
order by StartDate
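
Since SQL Server 2012 is mentioned above, the same idea can probably be expressed without self-joins by using the LAG and LEAD window functions that were introduced in that version. The following is only a sketch under a few assumptions: it runs against the question's sample data (re-declared here so the batch is self-contained), it assumes OutcomeID is never NULL, and the CTE and column names (flagged, starts, prev_outcome, rn) are made up for illustration.

```sql
-- Sample data from the question (dates written as YYYYMMDD so they parse under any language setting).
DECLARE @ProductTests TABLE
(
  RequestID int not null, ProductID int not null, TestID int not null,
  TestDate datetime null, OutcomeID int
);

INSERT INTO @ProductTests (RequestID, ProductID, TestID, TestDate, OutcomeID)
VALUES (1,2,22,'20050121',10), (1,2,42,'20070317',10), (1,2,45,'20101225',10),
       (1,2,325,'20110114',13), (1,2,895,'20110810',15), (1,2,111,'20111223',15),
       (1,2,636,'20120502',10), (1,2,554,'20121108',17);

WITH flagged AS (
  -- attach the previous test's outcome to every row
  SELECT
    *,
    prev_outcome = LAG(OutcomeID) OVER (PARTITION BY RequestID, ProductID ORDER BY TestDate),
    rn           = ROW_NUMBER()   OVER (PARTITION BY RequestID, ProductID ORDER BY TestDate)
  FROM @ProductTests
)
, starts AS (
  -- keep the first test of each product and every test whose outcome differs from the previous one
  -- (assumption: OutcomeID is never NULL, otherwise the <> comparison would skip rows)
  SELECT RequestID, ProductID, TestDate, OutcomeID
  FROM flagged
  WHERE rn = 1 OR prev_outcome <> OutcomeID
)
SELECT
  RequestID,
  ProductID,
  StartDate = TestDate,
  -- the next period's start is this period's end; NULL for the last period
  EndDate   = LEAD(TestDate) OVER (PARTITION BY RequestID, ProductID ORDER BY TestDate),
  OutcomeID
FROM starts
ORDER BY RequestID, ProductID, StartDate;
```

On the sample data the starts CTE keeps tests 22, 325, 895, 636 and 554, so the final SELECT should return the five rows requested in the question. LAG and LEAD are not available before SQL Server 2012, in which case the join-based query above remains the option.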

Answer 1 (score: 0)

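Actually, there seem to be two problems in your question. One is how to group sequences of rows (ordered by some criterion) that contain the same value. The other is the one actually spelled out in your title, i.e. how to use the next row's StartDate as the current row's EndDate.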

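Personally, I would solve the two problems in the order I mentioned them, so I would tackle the grouping first. One way to group the data correctly in this case is to use double ranking, like this: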

WITH partitioned AS (
  SELECT
    *,
    grp = ROW_NUMBER() OVER (PARTITION BY RequestID, ProductID            ORDER BY TestDate)
        - ROW_NUMBER() OVER (PARTITION BY RequestID, ProductID, OutcomeID ORDER BY TestDate)
  FROM @ProductTests
)
, grouped AS (
  SELECT
    RequestID,
    ProductID,
    StartDate = MIN(TestDate),
    OutcomeID
  FROM partitioned
  GROUP BY
    RequestID,
    ProductID,
    OutcomeID,
    grp
)
SELECT *
FROM grouped
;

For your data sample, this should give output like the following:

RequestID  ProductID  StartDate   OutcomeID
---------  ---------  ----------  ---------
1          2          2005-01-21  10
1          2          2011-01-14  13
1          2          2011-08-10  15
1          2          2012-05-02  10
1          2          2012-11-08  17
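
To see why the difference of the two rankings identifies each run, it may help to look at the intermediate values. The sketch below re-declares the question's sample data so it can run on its own and selects both rankings next to grp; the column names rn_all and rn_outcome are illustrative and do not appear in the query above.

```sql
-- Sample data from the question (dates written as YYYYMMDD so they parse under any language setting).
DECLARE @ProductTests TABLE
(
  RequestID int not null, ProductID int not null, TestID int not null,
  TestDate datetime null, OutcomeID int
);

INSERT INTO @ProductTests (RequestID, ProductID, TestID, TestDate, OutcomeID)
VALUES (1,2,22,'20050121',10), (1,2,42,'20070317',10), (1,2,45,'20101225',10),
       (1,2,325,'20110114',13), (1,2,895,'20110810',15), (1,2,111,'20111223',15),
       (1,2,636,'20120502',10), (1,2,554,'20121108',17);

SELECT
  TestID,
  TestDate,
  OutcomeID,
  -- position among all tests of the product
  rn_all     = ROW_NUMBER() OVER (PARTITION BY RequestID, ProductID            ORDER BY TestDate),
  -- position among the tests of the product that share this outcome
  rn_outcome = ROW_NUMBER() OVER (PARTITION BY RequestID, ProductID, OutcomeID ORDER BY TestDate),
  -- both counters advance together inside a run, so their difference stays constant there
  grp        = ROW_NUMBER() OVER (PARTITION BY RequestID, ProductID            ORDER BY TestDate)
             - ROW_NUMBER() OVER (PARTITION BY RequestID, ProductID, OutcomeID ORDER BY TestDate)
FROM @ProductTests
ORDER BY TestDate;
```

In date order, rn_all runs from 1 to 8, rn_outcome is 1, 2, 3, 1, 1, 2, 4, 1, and grp is therefore 0, 0, 0, 3, 4, 4, 3, 7. Tests 22, 42 and 45 share grp 0 and tests 895 and 111 share grp 4, so each run collapses into one group. Note that test 325 (OutcomeID 13) and test 636 (OutcomeID 10) both get grp 3, which is why OutcomeID has to stay in the GROUP BY alongside grp.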

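Obviously, one thing is still missing: the EndDate, and now is the right time to take care of it. Use ROW_NUMBER() once more to rank the result set of the grouped CTE, then join that result set to itself with an outer join, using the rankings in the join condition: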

WITH partitioned AS (
  SELECT
    *,
    grp = ROW_NUMBER() OVER (PARTITION BY RequestID, ProductID            ORDER BY TestDate)
        - ROW_NUMBER() OVER (PARTITION BY RequestID, ProductID, OutcomeID ORDER BY TestDate)
  FROM @ProductTests
)
, grouped AS (
  SELECT
    RequestID,
    ProductID,
    StartDate = MIN(TestDate),
    OutcomeID,
    rnk = ROW_NUMBER() OVER (PARTITION BY RequestID, ProductID ORDER BY MIN(TestDate))
  FROM partitioned
  GROUP BY
    RequestID,
    ProductID,
    OutcomeID,
    grp
)
SELECT
  g1.RequestID,
  g1.ProductID,
  g1.StartDate,
  g2.StartDate AS EndDate,
  g1.OutcomeID
FROM grouped g1
LEFT JOIN grouped g2
  ON g1.RequestID = g2.RequestID
 AND g1.ProductID = g2.ProductID
 AND g1.rnk = g2.rnk - 1
;

You can try this query at SQL Fiddle to verify that it returns the output you are after.