MSSQL: Identifying runs of unchanging (flatline) values in a time series

Asked: 2014-01-28 19:29:38

Tags: sql sql-server loops time-series

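I am working with time-series data from sensor measurements. I need to identify periods where the data flatlines, which indicates that a sensor has failed. I want to select the places where more than 3 consecutive values are unchanged within the last 24 hours.

I think I may need a loop, but I have never used loops in SQL. I assume I need a subquery with ORDER BY DateTime. I have also looked at LEAD and LAG. In addition, I need to distinguish by SiteID and VariableID, which I think can be done with PARTITION.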

The data looks like this:

SiteID  VariableID  DateTime  Value
   5    1   2014-01-27 12:15    5.576
   5    1   2014-01-27 12:30    5.487
   5    1   2014-01-27 12:45    5.573
   5    1   2014-01-27 13:00    5.903
   5    87  2014-01-27 12:15    -273.2
   5    87  2014-01-27 12:30    -273.2
   5    87  2014-01-27 12:45    -273.2
   5    87  2014-01-27 13:00    -273.2
   5    88  2014-01-27 12:15    -273.2
   5    88  2014-01-27 12:30    -273.2
   5    88  2014-01-27 12:45    -273.2
   5    88  2014-01-27 13:00    -273.2
   5    89  2014-01-27 12:15    -273.2
   5    89  2014-01-27 12:30    -273.2
   5    89  2014-01-27 12:45    -273.2
   5    89  2014-01-27 13:00    -273.2
   5    2   2014-01-27 12:15    30.61
   5    2   2014-01-27 12:30    38.73
   5    2   2014-01-27 12:45    32.84
   5    2   2014-01-27 13:00    31.62
   5    3   2014-01-27 12:15    -9.53
   5    3   2014-01-27 12:30    -8.61
   5    3   2014-01-27 12:45    -8.76
   5    3   2014-01-27 13:00    -9.32
   5    4   2014-01-27 12:15    0.298
   5    4   2014-01-27 12:30    0.32
   5    4   2014-01-27 12:45    0.317
   5    4   2014-01-27 13:00    0.302

I would like to produce something like this:

SiteID  VariableID  StartingDateTime  ValueCount  Value
    5          87     2014-01-27 12:15       4         -273.2
    5          88     2014-01-27 12:15       4         -273.2
    5          89     2014-01-27 12:15       4         -273.2

1 answer:

Answer 0 (score: 1)

SQL Fiddle

With this schema and data (slightly modified, just to make sure everything works):

CREATE TABLE TimeSeries (
  SiteId INT,
  VariableId INT,
  DateTime DATETIME,
  Value NUMERIC(15,5)
);

INSERT INTO TimeSeries VALUES (    5, 1   , '2014-01-27 12:15' ,   5.576     );
INSERT INTO TimeSeries VALUES (    5, 1   , '2014-01-27 12:30' ,   5.487     );
INSERT INTO TimeSeries VALUES (    5, 1   , '2014-01-27 12:45' ,   5.573     );
INSERT INTO TimeSeries VALUES (    5, 1   , '2014-01-27 13:00' ,   5.903     );
INSERT INTO TimeSeries VALUES (    5, 87  , '2014-01-27 12:15' ,   -273.2    );
INSERT INTO TimeSeries VALUES (    5, 87  , '2014-01-27 12:30' ,   -273.2    );
INSERT INTO TimeSeries VALUES (    5, 87  , '2014-01-27 12:45' ,   -273.2    );
INSERT INTO TimeSeries VALUES (    5, 87  , '2014-01-27 13:00' ,   -273.2    );
INSERT INTO TimeSeries VALUES (    5, 88  , '2014-01-27 12:15' ,   -273.2    );
INSERT INTO TimeSeries VALUES (    5, 88  , '2014-01-27 12:30' ,   -273.2    );
INSERT INTO TimeSeries VALUES (    5, 88  , '2014-01-27 12:45' ,   -273.2    );
INSERT INTO TimeSeries VALUES (    5, 88  , '2014-01-27 13:00' ,   -273.2    );
INSERT INTO TimeSeries VALUES (    5, 89  , '2014-01-27 12:15' ,   -273.2    );
INSERT INTO TimeSeries VALUES (    5, 89  , '2014-01-27 12:30' ,   -273.2    );
INSERT INTO TimeSeries VALUES (    5, 89  , '2014-01-27 12:45' ,   -273.2    );
INSERT INTO TimeSeries VALUES (    5, 89  , '2014-01-27 13:00' ,   -273.2    );
INSERT INTO TimeSeries VALUES (    5, 2   , '2014-01-27 12:15' ,   30.61     );
INSERT INTO TimeSeries VALUES (    5, 2   , '2014-01-27 12:30' ,   38.73     );
INSERT INTO TimeSeries VALUES (    5, 2   , '2014-01-27 12:45' ,   32.84     );
INSERT INTO TimeSeries VALUES (    5, 2   , '2014-01-27 13:00' ,   31.62     );
INSERT INTO TimeSeries VALUES (    5, 3   , '2014-01-27 12:15' ,   -9.53     );
INSERT INTO TimeSeries VALUES (    5, 3   , '2014-01-27 12:30' ,   -8.61     );
INSERT INTO TimeSeries VALUES (    5, 3   , '2014-01-27 12:45' ,   -8.76     );
INSERT INTO TimeSeries VALUES (    5, 3   , '2014-01-27 13:00' ,   -9.32     );
INSERT INTO TimeSeries VALUES (    5, 4   , '2014-01-27 12:15' ,   0.298     );
INSERT INTO TimeSeries VALUES (    5, 4   , '2014-01-27 12:30' ,   0.32      );
INSERT INTO TimeSeries VALUES (    5, 4   , '2014-01-27 12:45' ,   0.317     );
INSERT INTO TimeSeries VALUES (    5, 4   , '2014-01-27 13:00' ,   0.302     );

-- Just to make sure the query works
INSERT INTO TimeSeries VALUES (    5, 89  , '2014-01-27 18:30' ,   10        );
INSERT INTO TimeSeries VALUES (    5, 89  , '2014-01-27 19:00' ,   -273.2    ); -- this is not a contiguous value

Query:

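WITH Sequences AS (
  SELECT
    T.*,
    -- RNO: position of the row among rows with the same site, variable and value
    ROW_NUMBER() OVER (PARTITION BY SiteId, VariableId, Value ORDER BY DateTime) AS RNO,
    -- RNE: position of the row among all rows, ordered by site, variable and time
    ROW_NUMBER() OVER (ORDER BY SiteId, VariableId, DateTime) AS RNE
  FROM
    TimeSeries T
)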
SELECT
  S.SiteId,
  S.VariableId,
  S.Value,
  MIN(S.DateTime) AS [Start],
  MAX(S.DateTime) AS [End],
  COUNT(*) AS ValueCount
FROM
  Sequences S
GROUP BY
  S.SiteId,
  S.VariableId,
  S.Value,
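  S.RNE - S.RNO  -- constant within one run of consecutive equal values
HAVING
  COUNT(*) > 1   -- runs of 2 or more rows; the question asks for more than 3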
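
The query above reports every run of two or more equal values in the whole table, whereas the question asks for runs of more than 3 values within the last 24 hours. A minimal sketch of that variant follows. The `@Since` variable and the choice of `GETDATE()` as the reference time are assumptions added for illustration and are not part of the original answer.

```sql
-- Assumption: "the last 24 hours" is measured from the server's current time.
DECLARE @Since DATETIME = DATEADD(HOUR, -24, GETDATE());

WITH Sequences AS (
  SELECT
    T.SiteId,
    T.VariableId,
    T.[DateTime],
    T.Value,
    -- Same two row numbers as in the original query
    ROW_NUMBER() OVER (PARTITION BY T.SiteId, T.VariableId, T.Value ORDER BY T.[DateTime]) AS RNO,
    ROW_NUMBER() OVER (ORDER BY T.SiteId, T.VariableId, T.[DateTime]) AS RNE
  FROM TimeSeries T
  WHERE T.[DateTime] >= @Since          -- only readings inside the window
)
SELECT
  S.SiteId,
  S.VariableId,
  MIN(S.[DateTime]) AS StartingDateTime,
  COUNT(*)          AS ValueCount,
  S.Value
FROM Sequences S
GROUP BY S.SiteId, S.VariableId, S.Value, S.RNE - S.RNO
HAVING COUNT(*) > 3                     -- more than 3 consecutive equal values
ORDER BY S.SiteId, S.VariableId, StartingDateTime;
```

Because the filter is applied before the rows are numbered, a run that straddles the cutoff is counted only by the part that falls inside the window. Against the 2014 sample data, `@Since` would have to be set to a fixed value such as `'2014-01-27 00:00'` to return any rows. The results shown below are those of the original query, not of this variant.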

Results

| SITEID | VARIABLEID |  VALUE |                          START |                            END | VALUECOUNT |
|--------|------------|--------|--------------------------------|--------------------------------|------------|
|      5 |         87 | -273.2 | January, 27 2014 12:15:00+0000 | January, 27 2014 13:00:00+0000 |          4 |
|      5 |         88 | -273.2 | January, 27 2014 12:15:00+0000 | January, 27 2014 13:00:00+0000 |          4 |
|      5 |         89 | -273.2 | January, 27 2014 12:15:00+0000 | January, 27 2014 13:00:00+0000 |          4 |

You can see that for VariableId = 89 only 4 records were found (because the last 2 records I added should not be counted).
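
To see why the two extra rows fall outside the run, it can help to look at the row numbers directly. The sketch below reuses the same CTE and lists the rows for VariableId = 89; the `RunKey` alias is a name introduced here for illustration.

```sql
WITH Sequences AS (
  SELECT
    T.*,
    ROW_NUMBER() OVER (PARTITION BY SiteId, VariableId, Value ORDER BY DateTime) AS RNO,
    ROW_NUMBER() OVER (ORDER BY SiteId, VariableId, DateTime) AS RNE
  FROM TimeSeries T
)
SELECT
  S.[DateTime],
  S.Value,
  S.RNO,
  S.RNE,
  S.RNE - S.RNO AS RunKey   -- hypothetical alias for the grouping expression
FROM Sequences S
WHERE S.SiteId = 5
  AND S.VariableId = 89
ORDER BY S.[DateTime];
```

The first four -273.2 rows advance RNO and RNE together, so they share one RunKey. The 18:30 row with value 10 advances RNE but not the RNO of the -273.2 rows, so the 19:00 row ends up with a different RunKey and forms a group of its own, which `HAVING COUNT(*) > 1` then discards.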

Based on this SO answer and this blog post.
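
The question also mentions LAG. As an alternative sketch, and not the method used in the answer above, the same runs can be found on SQL Server 2012 or later by flagging each row whose value differs from the previous reading and taking a running total of those flags. The names `Flagged`, `Runs`, `IsRunStart` and `RunId` are introduced here for illustration.

```sql
WITH Flagged AS (
  SELECT
    T.SiteId,
    T.VariableId,
    T.[DateTime],
    T.Value,
    -- 0 when the value equals the previous reading of the same sensor,
    -- 1 otherwise (including the first reading, where LAG returns NULL)
    CASE
      WHEN LAG(T.Value) OVER (PARTITION BY T.SiteId, T.VariableId
                              ORDER BY T.[DateTime]) = T.Value
      THEN 0 ELSE 1
    END AS IsRunStart
  FROM TimeSeries T
),
Runs AS (
  SELECT
    F.SiteId,
    F.VariableId,
    F.[DateTime],
    F.Value,
    -- Running total of run starts: a run number within each sensor
    SUM(F.IsRunStart) OVER (
      PARTITION BY F.SiteId, F.VariableId
      ORDER BY F.[DateTime]
      ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS RunId
  FROM Flagged F
)
SELECT
  R.SiteId,
  R.VariableId,
  MIN(R.[DateTime]) AS StartingDateTime,
  COUNT(*)          AS ValueCount,
  R.Value
FROM Runs R
GROUP BY R.SiteId, R.VariableId, R.Value, R.RunId
HAVING COUNT(*) > 3;
```

Note that a NULL Value never compares equal to the previous reading, so in this sketch every NULL row starts a new run.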