Get the max date in each 'block' of data

Time: 2014-07-30 13:35:44

Tags: sql-server

I have a table that looks like this (when ordered by PersonID, Date):

 PersonID | Stage       | Date
----------|-------------|------------
 12       | Start       | 01 Jan 2010
 12       | Step 1      | 03 Jan 2010
 12       | Step 2      | 05 Jan 2010
 12       | Start       | 06 Jan 2010
 12       | Step 1      | 07 Jan 2010
 12       | Start       | 09 Jan 2010
 ...

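Stage essentially breaks the activity into blocks: the first three records are one block of activity, the next two are another block, and so on. Every block begins with Start, but there is no defined end (for example, the first block ends with Step 2, the second ends with Step 1, and the third might end with Step 47, Turnips, or anything else):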

Block 1:
 PersonID | Stage       | Date
----------|-------------|------------
 12       | Start       | 01 Jan 2010
 12       | Step 1      | 03 Jan 2010
 12       | Step 2      | 05 Jan 2010

Block 2:
 PersonID | Stage       | Date
----------|-------------|------------
 12       | Start       | 06 Jan 2010
 12       | Step 1      | 07 Jan 2010

I need to query the max date for each block (for example, 'Block 1' would be 05 Jan 2010 and 'Block 2' would be 07 Jan 2010), but I have no idea how to go about it:

select
     PersonID
    ,max(Date)
from
    MyTable -- subquery / cte required?

group by
    PersonID

I haven't even really got started on this. I suspect I need a subquery/CTE to define the blocks first, or maybe I can do something with row_number() over (partition by ...)?

4 Answers:

Answer 0: (score: 1)

I loaded the data into a table variable and used OUTER APPLY:

DECLARE @table TABLE (PersonID INT, Stage VARCHAR(100), Date DateTime2)

INSERT INTO @table SELECT  12, 'Start', '2010-01-01'
INSERT INTO @table SELECT  12, 'Step 1', '2010-01-03'
INSERT INTO @table SELECT  12, 'Step 2', '2010-01-05'
INSERT INTO @table SELECT  12, 'Start', '2010-01-06'
INSERT INTO @table SELECT  12, 'Step 1', '2010-01-07'
INSERT INTO @table SELECT  12, 'Start', '2010-01-09'
INSERT INTO @table SELECT  13, 'Start', '2010-01-01'
INSERT INTO @table SELECT  13, 'Step 1', '2010-01-03'
INSERT INTO @table SELECT  13, 'Step 2', '2010-01-05'
INSERT INTO @table SELECT  13, 'Start', '2010-01-06'
INSERT INTO @table SELECT  13, 'Step 1', '2010-01-07'
INSERT INTO @table SELECT  13, 'Start', '2010-01-09'

;WITH StartTable AS (

    SELECT PersonID, Stage, [Date], ROW_NUMBER() OVER (ORDER BY PersonID, Date) AS ROWID
    FROM @table
    WHERE Stage = 'Start'

)
SELECT PersonID, T2.MaxDate
FROM StartTable ST1
OUTER APPLY (
    SELECT MAX(TB.Date) AS MaxDate
    FROM StartTable ST2
    INNER JOIN @table TB 
        ON TB.PersonID = ST2.PersonID
        AND TB.Date < ST2.Date
    WHERE ST1.ROWID + 1 = ST2.ROWID
        AND ST1.PersonID = ST2.PersonID
    GROUP BY TB.PersonID
) AS T2

Or the logic could look like this:

SELECT T1.PersonID, (
        SELECT MAX([Date])
        FROM @table T2 -- gives MAX date in stage
        WHERE T2.Stage != 'Start' AND T2.PersonID = T1.PersonID
            AND [Date] < (SELECT MIN(Date) FROM @table T3 WHERE T3.Stage = 'Start' AND T3.Date > T1.Date AND T3.PersonID = T1.PersonID) -- gives next date for starting stage
) 
FROM @table T1
WHERE Stage = 'Start'

Answer 1: (score: 0)

I think this is at least a good start: SQL Fiddle

with RNs as (
    select
        PersonID,
        Stage,
        row_number() over (order by PersonID, [Date]) as RN,
        [Date]
    from
        table1
),
Starts as (
    select
        PersonID,
        Stage,
        RN,
        [Date]
    from
        RNs
    where
        Stage = 'Start'
)
select
    RNs.PersonID,
    RNs.Stage,
    RNs.[Date]
from
    RNs
    inner join Starts
        on RNs.RN = Starts.RN - 1

Basically, this gets you the row immediately before each 'Start' row (based on ordering by the person and date columns).
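
A query that returns the row before each Start cannot return the final block, because no Start row follows it. Below is a minimal sketch of one way to include it, assuming SQL Server 2012 or later (for LEAD) and that dates increase within a block, as in the sample data. A row is treated as the end of its block when the next row for the same person is a Start or does not exist. The table variable @MyTable and its rows are illustrative stand-ins for the question's table.

```sql
-- Illustrative sample data mirroring the question's table (assumed names).
DECLARE @MyTable TABLE (PersonID INT, Stage VARCHAR(100), [Date] DATE);

INSERT INTO @MyTable (PersonID, Stage, [Date]) VALUES
    (12, 'Start',  '20100101'),
    (12, 'Step 1', '20100103'),
    (12, 'Step 2', '20100105'),
    (12, 'Start',  '20100106'),
    (12, 'Step 1', '20100107'),
    (12, 'Start',  '20100109');

WITH Flagged AS (
    SELECT
        PersonID,
        Stage,
        [Date],
        -- Stage of the following row for the same person; NULL on the last row.
        LEAD(Stage) OVER (PARTITION BY PersonID ORDER BY [Date]) AS NextStage
    FROM @MyTable
)
SELECT PersonID, Stage, [Date]
FROM Flagged
WHERE NextStage = 'Start'   -- the next row opens a new block
   OR NextStage IS NULL     -- no next row: end of the person's last block
ORDER BY PersonID, [Date];
```

For the six sample rows this returns the rows dated 05 Jan, 07 Jan and 09 Jan 2010. Partitioning by PersonID keeps one person's rows from being paired with another person's Start.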

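Answer 2: (score: 0)

It's not 100% clear what you want, but this will get the PersonID, max Date, and StageNumber for each "grouping". Note that I changed the column to MyDate to avoid naming issues. I would suggest normalizing this data into at least a couple of tables so you don't have to fight with multiple "types" of rows in the same table.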

declare @Something table
(
    PersonID int,
    Stage varchar(10),
    MyDate date
);

insert @Something
select 12, 'Start', '01 Jan 2010' union all
select 12, 'Step 1', '03 Jan 2010' union all
select 12, 'Step 2', '05 Jan 2010' union all
select 12, 'Start', '06 Jan 2010' union all
select 12, 'Step 1', '07 Jan 2010' union all
select 12, 'Start', '09 Jan 2010';

with MyOrderedData as
(
    select PersonID
        , MyDate
        , ROW_NUMBER() over(partition by Stage order by MyDate) as Stage
    from @Something
)

select PersonID, MAX(MyDate), Stage
from MyOrderedData
group by PersonID, Stage
order by Stage;
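
The row number above is computed per Stage value, so it lines rows up into blocks only when every block runs through the same sequence of stages. A sketch of an alternative numbering, assuming SQL Server 2012 or later (for the windowed running SUM): count the Start rows seen so far for each person and use that count as the block number. BlockNumber and MaxDate are names introduced here for illustration.

```sql
-- Same illustrative data as above, declared again so the sketch runs on its own.
DECLARE @Something TABLE (PersonID INT, Stage VARCHAR(10), MyDate DATE);

INSERT INTO @Something (PersonID, Stage, MyDate) VALUES
    (12, 'Start',  '20100101'),
    (12, 'Step 1', '20100103'),
    (12, 'Step 2', '20100105'),
    (12, 'Start',  '20100106'),
    (12, 'Step 1', '20100107'),
    (12, 'Start',  '20100109');

WITH Numbered AS (
    SELECT
        PersonID,
        MyDate,
        -- Running count of Start rows per person: every row inherits the
        -- number of the most recent Start at or before it.
        SUM(CASE WHEN Stage = 'Start' THEN 1 ELSE 0 END)
            OVER (PARTITION BY PersonID
                  ORDER BY MyDate
                  ROWS UNBOUNDED PRECEDING) AS BlockNumber
    FROM @Something
)
SELECT PersonID, BlockNumber, MAX(MyDate) AS MaxDate
FROM Numbered
GROUP BY PersonID, BlockNumber
ORDER BY PersonID, BlockNumber;
```

For the sample rows this gives blocks 1, 2 and 3 with max dates 05 Jan, 07 Jan and 09 Jan 2010. Because the block number depends only on where the Start rows fall, the blocks may end on any stage.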

Answer 3: (score: 0)

Try this:

with cte2 as
(
    select rownum
    from (
        select row_number() over (order by PersonID, [Date]) as rownum, Stage, [Date]
        from dbo.test
    ) temp1
    where Stage = 'Start'
),
cte3 as
(
    select row_number() over (order by PersonID, [Date]) as rownum, Stage, [Date]
    from dbo.test
)
select [Date]
from cte3
where rownum in (select rownum - 1 from cte2)

dbo.test is the original table.