T-SQL: finding local maxima and summing the maxima of a running value

Time: 2018-08-28 11:09:25

Tags: sql sql-server tsql running-total

I have a table with a running value that is sometimes reset:

--------------------------------------
|         Time           |   Value   |
--------------------------------------
|2018-08-11 00:16:00.000 |     4     |
|2018-08-11 00:17:00.000 |     8     |
|2018-08-11 00:18:00.000 |     12    |
|2018-08-11 00:19:00.000 |     16    |
|2018-08-11 00:20:00.000 |     27    |
|2018-08-11 00:21:00.000 |     0     |   -- Doesn't necessarily have to be 0
|2018-08-11 00:22:00.000 |     3     |
|2018-08-11 00:23:00.000 |     5     |
|2018-08-11 00:24:00.000 |     4     |   -- Goes down slightly, but the drop stays within the limit value
|2018-08-11 00:25:00.000 |     12    |
|2018-08-11 00:26:00.000 |     18    |
--------------------------------------

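I am trying to get the sum of all local maxima, that is, the largest values reached before each reset. Taken naively, those are: 27, 5, 18

But there is a special case: the local maximum should ignore small dips, because the running value can sometimes drop slightly. In the example above, the value 5 should be ignored, because the next value is 4 and the series then keeps growing. The actual maxima would be: 27, 18

Result: 27 + 18 = 45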


Sample SQL

CREATE TABLE Data (
  [Time] [datetime] NOT NULL,
  [Value] [real] NOT NULL
);

INSERT INTO Data ([Time], [Value]) VALUES(CAST('2018-08-11 00:16:00' AS DATETIME),'4' );
INSERT INTO Data ([Time], [Value]) VALUES(CAST('2018-08-11 00:17:00' AS DATETIME),'8' );
INSERT INTO Data ([Time], [Value]) VALUES(CAST('2018-08-11 00:18:00' AS DATETIME),'12');
INSERT INTO Data ([Time], [Value]) VALUES(CAST('2018-08-11 00:19:00' AS DATETIME),'16');
INSERT INTO Data ([Time], [Value]) VALUES(CAST('2018-08-11 00:20:00' AS DATETIME),'27');
INSERT INTO Data ([Time], [Value]) VALUES(CAST('2018-08-11 00:21:00' AS DATETIME),'0' );
INSERT INTO Data ([Time], [Value]) VALUES(CAST('2018-08-11 00:22:00' AS DATETIME),'3' );
INSERT INTO Data ([Time], [Value]) VALUES(CAST('2018-08-11 00:23:00' AS DATETIME),'5' );
INSERT INTO Data ([Time], [Value]) VALUES(CAST('2018-08-11 00:24:00' AS DATETIME),'4' );
INSERT INTO Data ([Time], [Value]) VALUES(CAST('2018-08-11 00:25:00' AS DATETIME),'12');
INSERT INTO Data ([Time], [Value]) VALUES(CAST('2018-08-11 00:26:00' AS DATETIME),'18');

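Proposed solution / what I have tried: I thought about finding the local maxima by applying ROW_NUMBER() ordered by Time and joining the table to itself on the row number + 1. I can then compare the two values and ignore the drop if the difference does not exceed the limit. However, the last entry is not selected this way, and I am not sure whether this approach works as intended or is reasonably efficient.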

WITH TAB0 AS (
    SELECT
        *, rn = ROW_NUMBER() OVER (ORDER BY Time)
    FROM
        Data
)
SELECT
    t1.Time,
    t1.Value as MT1,
    t2.Value as MT2
FROM
    TAB0 t1
    INNER JOIN TAB0 t2 ON t2.rn = t1.rn + 1
        AND (t2.Value + 1) < t1.Value                --put the limit here instead of "+1"
    ORDER BY t1.Time;

2 answers:

Answer 0: (score: 3)

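For a local maximum, the logic is that a value is greater than both the value before it and the value after it.

Your enhanced condition seems to be mostly about setting an additional limit. The basic logic you describe looks like this: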

select sum(value)
from (select d.*,
             lag(value) over (order by time) as prev_value,
             lead(value) over (order by time) as next_value
      from data d
     ) d
where value > prev_value and value > next_value;

Answer 1: (score: 0)

Following Gordon's answer, which put me on the right track, we need to use the previous and next values, which is what this query provides:

select d.*,
    lag(value) over (order by time) as prev_value,
    lead(value) over (order by time) as next_value
from data d

By comparing each value with the previous and next values, we can filter out the maxima (and add the limit there):

select (value)     -- returns each matching value; use SUM(value) instead to get their sum
from (select d.*,
         lag(value) over (order by time) as prev_value,
         lead(value) over (order by time) as next_value
  from data d
  ) d
where value > (prev_value +1) and value > (next_value +1);

/*   The +1 in the WHERE clause sets the limit to 1   */

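Now we need to handle the edge cases (the first and last rows, where there is no previous or next value), which is done simply by replacing NULL with 0.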


Full query:

select SUM(value)
from (select d.*,
         ISNULL(lag(value) over (order by time), 0) as prev_value,
         ISNULL(lead(value) over (order by time), 0) as next_value
  from data d
) d
where value > (prev_value + 1) and value > (next_value + 1);
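
As a possible variation on the full query, the limit can be pulled out into a variable, and the NULL handling can be moved into the optional third argument of LAG and LEAD, which supplies a default when no neighbouring row exists. This is only a sketch: the variable name @limit and the alias sum_of_maxima are made up for illustration, and it assumes SQL Server 2012 or later and the Data table from the question.

```sql
-- Sketch only: @limit is a hypothetical variable, not part of the original answers.
DECLARE @limit real = 1;

SELECT SUM(value) AS sum_of_maxima
FROM (
    SELECT d.value,
           -- The third argument is the value returned when there is no previous/next row.
           LAG(value, 1, 0)  OVER (ORDER BY [time]) AS prev_value,
           LEAD(value, 1, 0) OVER (ORDER BY [time]) AS next_value
    FROM Data d
) d
WHERE value > prev_value + @limit
  AND value > next_value + @limit;
```

With the sample data, only 27 and 18 pass both comparisons, so this should return 45. Treating a missing neighbour as 0 assumes the running value is never negative; if it can be, a different default would be needed.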