Calculating durations for a data set

Time: 2015-10-30 05:05:16

Tags: sql gaps-and-islands

I have a set of data in SQL that looks like this:

╔═══════════╦═══════╗
║ TimeStamp ║ State ║
╠═══════════╬═══════╣
║  7:10 AM  ║   A   ║
║  7:11 AM  ║   A   ║
║  7:12 AM  ║   A   ║
║  7:13 AM  ║   B   ║
║  7:14 AM  ║   B   ║
║  7:15 AM  ║   A   ║
║  7:16 AM  ║   A   ║
║  7:17 AM  ║   C   ║
║  7:18 AM  ║   C   ║
╚═══════════╩═══════╝

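I am trying to calculate the duration of each state. However, I want to treat every run of consecutive identical states separately and calculate its duration on its own, so that a state that recurs later is not merged with its earlier run. For the data above I would like the following returned: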

╔═══════╦════════════════════╗
║ State ║ Duration (minutes) ║
╠═══════╬════════════════════╣
║   A   ║         2          ║
║   B   ║         1          ║
║   A   ║         1          ║
║   C   ║         1          ║
╚═══════╩════════════════════╝

Can anyone help? How do I write a SQL query that returns this data?

Thanks!

2 answers:

Answer 0 (score: 0)

I will assume MS SQL Server is the target.

To get the expected result (a duration of 1 for C):

-- One output row per run of consecutive identical states.
select
    state,
    MIN(TimeStamp) StartsAt,
    EndsAt,
    datediff(minute, MIN(TimeStamp), EndsAt) DurationMinutes
from (
        select
            t1.state,
            t1.TimeStamp,
            -- the last run has no later state change, so fall back to the latest timestamp
            ISNULL(ca.EndsAt, (select max(TimeStamp) from table1)) EndsAt
        from table1 t1
        outer apply (
            -- first later row whose state differs from the current row
            select top (1) t2.TimeStamp as EndsAt
            from table1 t2
            where t1.state <> t2.state and t1.TimeStamp < t2.TimeStamp
            order by t2.TimeStamp
        ) ca
    ) as derived
group by
    state, EndsAt

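For the sample data, one could argue that the duration of C is unknown, because the state has not changed yet. In that case the query is a little simpler: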

-- Same as above, but EndsAt stays NULL for the last run, so its duration is NULL (unknown).
select
    state,
    MIN(TimeStamp) StartsAt,
    EndsAt,
    datediff(minute, MIN(TimeStamp), EndsAt) DurationMinutes
from (
        select
            t1.state,
            t1.TimeStamp,
            ca.EndsAt
        from table1 t1
        outer apply (
            -- first later row whose state differs from the current row
            select top (1) t2.TimeStamp as EndsAt
            from table1 t2
            where t1.state <> t2.state and t1.TimeStamp < t2.TimeStamp
            order by t2.TimeStamp
        ) ca
    ) as derived
group by
    state, EndsAt

http://sqlfiddle.com/#!6/f0dd7e/9
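
As a rough way to check what the first query returns for the sample data, the sketch below ports it to SQLite and runs it from Python. This is a port under assumptions, not the answer's own code: SQLite has neither `OUTER APPLY` nor `datediff`, so a correlated `MIN` subquery and `strftime('%s', ...)` stand in for them, the timestamps are stored as 24-hour `HH:MM` text, and the names `ROWS`, `QUERY` and `run_first_query` are made up for illustration.

```python
import sqlite3

# Sample data from the question, with times as 24-hour HH:MM text (an assumption).
ROWS = [
    ("07:10", "A"), ("07:11", "A"), ("07:12", "A"),
    ("07:13", "B"), ("07:14", "B"),
    ("07:15", "A"), ("07:16", "A"),
    ("07:17", "C"), ("07:18", "C"),
]

# SQLite port of the first query: the correlated MIN subquery replaces
# OUTER APPLY ... TOP (1), COALESCE replaces ISNULL, and the difference of
# two strftime('%s') values (seconds) divided by 60 replaces datediff(minute, ...).
QUERY = """
SELECT State,
       MIN(TimeStamp) AS StartsAt,
       EndsAt,
       (CAST(strftime('%s', EndsAt) AS INTEGER)
        - CAST(strftime('%s', MIN(TimeStamp)) AS INTEGER)) / 60 AS DurationMinutes
FROM (
    SELECT t1.State, t1.TimeStamp,
           COALESCE(
               (SELECT MIN(t2.TimeStamp) FROM table1 AS t2
                WHERE t2.State <> t1.State AND t2.TimeStamp > t1.TimeStamp),
               (SELECT MAX(TimeStamp) FROM table1)
           ) AS EndsAt
    FROM table1 AS t1
) AS derived
GROUP BY State, EndsAt
ORDER BY StartsAt
"""


def run_first_query():
    """Load the sample rows into an in-memory database and run the ported query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table1 (TimeStamp TEXT, State TEXT)")
    conn.executemany("INSERT INTO table1 VALUES (?, ?)", ROWS)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in run_first_query():
        print(row)
```

On the sample data this port yields 3, 2, 2 and 1 minutes, because each run is measured up to the first row of the next state rather than to its own last row; only the final run, C, falls back to the latest timestamp.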

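Answer 1 (score: 0)

You did not mention the RDBMS, so here is an answer that works on any database. If you need a faster solution, please say which SQL database you use, so that the features it offers for this kind of query (access to the previous or next record, ...) can be used.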

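SELECT MIN(TimeStamp), MAX(TimeStamp), State
FROM (
     -- Grp = number of earlier rows in a different state; it is constant within a run
     SELECT t1.TimeStamp, t1.State,
            (SELECT COUNT(*) FROM t
             WHERE t.State <> t1.State
               AND t.TimeStamp < t1.TimeStamp) AS Grp
     FROM t AS t1
     ) AS t2
GROUP BY State, Grp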

SQLFiddle demo
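
The query above returns the first and last timestamp of each run but not the duration itself. The sketch below is one possible way to finish it, run against SQLite from Python so it can be tried without a server. It rests on assumptions that are not in the answer: timestamps are stored as 24-hour `HH:MM` text, the minute difference is computed with SQLite's `strftime('%s', ...)`, an `ORDER BY` is added for a stable result, and the names `ROWS`, `QUERY` and `run_generic_query` are illustrative only.

```python
import sqlite3

# Sample data from the question, with times as 24-hour HH:MM text (an assumption).
ROWS = [
    ("07:10", "A"), ("07:11", "A"), ("07:12", "A"),
    ("07:13", "B"), ("07:14", "B"),
    ("07:15", "A"), ("07:16", "A"),
    ("07:17", "C"), ("07:18", "C"),
]

# Same grouping idea as the answer: Grp counts the earlier rows that are in a
# different state, so it is constant inside a run and grows between runs.
# The duration is the last timestamp of the run minus the first, in minutes.
QUERY = """
SELECT State,
       (CAST(strftime('%s', MAX(TimeStamp)) AS INTEGER)
        - CAST(strftime('%s', MIN(TimeStamp)) AS INTEGER)) / 60 AS DurationMinutes
FROM (
    SELECT t1.TimeStamp, t1.State,
           (SELECT COUNT(*) FROM t AS t0
            WHERE t0.State <> t1.State
              AND t0.TimeStamp < t1.TimeStamp) AS Grp
    FROM t AS t1
) AS t2
GROUP BY State, Grp
ORDER BY MIN(TimeStamp)
"""


def run_generic_query():
    """Load the sample rows into an in-memory database and return (State, minutes) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (TimeStamp TEXT, State TEXT)")
    conn.executemany("INSERT INTO t VALUES (?, ?)", ROWS)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for state, minutes in run_generic_query():
        print(state, minutes)
```

Taking the last timestamp minus the first inside each group gives 2, 1, 1 and 1 minutes for the sample data, which matches the table in the question.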