PostgreSQL: difference in a value over time

Time: 2016-06-20 15:32:31

Tags: postgresql window-functions

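I collect a series of readings every minute and store them in a Postgres database. I am trying to write a query that shows how much the value changed within each 5-minute period.

So I have the following data: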

val         created_at         
115414.568  2016-06-18 18:29:53
115443.656  2016-06-18 18:30:53
115461.817  2016-06-18 18:31:53
115494.406  2016-06-18 18:32:53
115527.151  2016-06-18 18:33:53
115550.096  2016-06-18 18:34:53
115610.065  2016-06-18 18:35:53
115640.957  2016-06-18 18:36:53
115667.033  2016-06-18 18:37:53
115683.302  2016-06-18 18:38:53
115727.717  2016-06-18 18:39:53
115748.331  2016-06-18 18:40:53
115763.520  2016-06-18 18:41:53
115795.607  2016-06-18 18:42:53
115849.592  2016-06-18 18:43:53
115871.538  2016-06-18 18:44:53
115908.999  2016-06-18 18:45:53
115923.776  2016-06-18 18:46:53
115961.043  2016-06-18 18:47:53
115988.369  2016-06-18 18:48:53
116003.320  2016-06-18 18:49:53
116056.299  2016-06-18 18:50:53
116069.396  2016-06-18 18:51:53
116092.485  2016-06-18 18:52:53
116137.878  2016-06-18 18:53:53
116162.937  2016-06-18 18:54:53
116204.077  2016-06-18 18:55:53
116235.593  2016-06-18 18:56:53
116242.502  2016-06-18 18:57:53
116285.713  2016-06-18 18:58:53
116317.299  2016-06-18 18:59:53
116340.120  2016-06-18 19:00:53
116387.000  2016-06-18 19:01:53

I would like the following grouping:

2016-06-18 18:25:00  ... 
2016-06-18 18:30:00  166.409
2016-06-18 18:35:00  138.266
2016-06-18 18:40:00  160.668
2016-06-18 18:45:00  147.300
2016-06-18 18:50:00  147.778
2016-06-18 18:55:00  136.043
2016-06-18 19:00:00  ...

I managed to cobble together the following:

SELECT
  val,
  first_value(val) over (partition by period_start) as first_value,
  period_start, 
  created_at
FROM (
  SELECT
    date_trunc('minute', created_at) - (EXTRACT(MINUTE FROM created_at)::INTEGER % 5) * INTERVAL '1 minute' AS period_start, 
    concat(kwh, '.', LPAD(wh::text, 3, '0'))::FLOAT as val,
    "readings"."created_at"
  FROM
    readings
  WHERE
    "readings"."created_at" between '2016-06-18 18:29:53' AND '2016-06-18 19:02:53'
) s1

This gives me the following:

val         first_value period_start        created_at
115414.568  115414.568  2016-06-18 18:25:00 2016-06-18 18:29:53.121609
115443.656  115443.656  2016-06-18 18:30:00 2016-06-18 18:30:53.124389
115461.817  115443.656  2016-06-18 18:30:00 2016-06-18 18:31:53.127074
115494.406  115443.656  2016-06-18 18:30:00 2016-06-18 18:32:53.129728
115527.151  115443.656  2016-06-18 18:30:00 2016-06-18 18:33:53.1324
115550.096  115443.656  2016-06-18 18:30:00 2016-06-18 18:34:53.135078
115610.065  115610.065  2016-06-18 18:35:00 2016-06-18 18:35:53.137708
115640.957  115610.065  2016-06-18 18:35:00 2016-06-18 18:36:53.140347
115667.033  115610.065  2016-06-18 18:35:00 2016-06-18 18:37:53.143023
115683.302  115610.065  2016-06-18 18:35:00 2016-06-18 18:38:53.145754
115727.717  115610.065  2016-06-18 18:35:00 2016-06-18 18:39:53.14852
115748.331  115748.331  2016-06-18 18:40:00 2016-06-18 18:40:53.151326
115763.520  115748.331  2016-06-18 18:40:00 2016-06-18 18:41:53.154003
115795.607  115748.331  2016-06-18 18:40:00 2016-06-18 18:42:53.156723
115849.592  115748.331  2016-06-18 18:40:00 2016-06-18 18:43:53.159454
115871.538  115748.331  2016-06-18 18:40:00 2016-06-18 18:44:53.162127
115908.999  115908.999  2016-06-18 18:45:00 2016-06-18 18:45:53.164743
115923.776  115908.999  2016-06-18 18:45:00 2016-06-18 18:46:53.167401
115961.043  115908.999  2016-06-18 18:45:00 2016-06-18 18:47:53.169997
115988.369  115908.999  2016-06-18 18:45:00 2016-06-18 18:48:53.17265
116003.320  115908.999  2016-06-18 18:45:00 2016-06-18 18:49:53.175299
116056.299  116056.299  2016-06-18 18:50:00 2016-06-18 18:50:53.17797
116069.396  116056.299  2016-06-18 18:50:00 2016-06-18 18:51:53.180955
116092.485  116056.299  2016-06-18 18:50:00 2016-06-18 18:52:53.183606
116137.878  116056.299  2016-06-18 18:50:00 2016-06-18 18:53:53.186317
116162.937  116056.299  2016-06-18 18:50:00 2016-06-18 18:54:53.189088
116204.077  116204.077  2016-06-18 18:55:00 2016-06-18 18:55:53.191821
116235.593  116204.077  2016-06-18 18:55:00 2016-06-18 18:56:53.194513
116242.502  116204.077  2016-06-18 18:55:00 2016-06-18 18:57:53.197222
116285.713  116204.077  2016-06-18 18:55:00 2016-06-18 18:58:53.199996
116317.299  116204.077  2016-06-18 18:55:00 2016-06-18 18:59:53.208784
116340.120  116340.120  2016-06-18 19:00:00 2016-06-18 19:00:53.217547
116387.000  116340.120  2016-06-18 19:00:00 2016-06-18 19:01:53.226262

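So I guess the next step is to subtract the first value of the 2016-06-18 18:30:00 period from the first value of the 2016-06-18 18:35:00 period, but I don't know how to do that (possibly with a window function), unless there is a simpler way that I am overlooking?

Any tips appreciated.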

2 Answers:

Answer 0: (score: 0)

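This may be a resampling/interpolation problem. You want one value every 5 minutes; once you have that, you can take the difference between successive rows with a window function.

Resampling and interpolation are a fairly common "data science" problem, but they are hard to solve in Postgres. This great article offers many answers to the problem. The important pieces are: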

  • generate_series
  • linear_interpolate
  • window functions
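
A minimal sketch of how the first and last of those pieces could fit together, assuming the readings table from the question (columns kwh, wh, created_at) and a hard-coded 18:30 to 19:00 range. It skips interpolation entirely and simply takes the first reading inside each 5-minute slot, so it is not the article's method. The CTE names vals, grid and firsts are made up for illustration, and the cast to numeric instead of FLOAT is my own choice.

```sql
-- Assumption: readings(kwh, wh, created_at) as used in the question's query.
WITH vals AS (
  -- Same value expression as in the question, but cast to numeric
  -- so the subtraction below is exact decimal arithmetic.
  SELECT
    created_at,
    concat(kwh, '.', lpad(wh::text, 3, '0'))::numeric AS val
  FROM readings
),
grid AS (
  -- One row per 5-minute period start (range hard-coded for illustration).
  SELECT generate_series(
           timestamp '2016-06-18 18:30:00',
           timestamp '2016-06-18 19:00:00',
           interval '5 minutes'
         ) AS period_start
),
firsts AS (
  -- Earliest reading that falls inside each period (NULL if there is none).
  SELECT
    g.period_start,
    (SELECT v.val
       FROM vals v
      WHERE v.created_at >= g.period_start
        AND v.created_at <  g.period_start + interval '5 minutes'
      ORDER BY v.created_at
      LIMIT 1) AS first_val
  FROM grid g
)
SELECT
  period_start,
  -- First value of the next period minus first value of this period.
  lead(first_val) OVER (ORDER BY period_start) - first_val AS diff
FROM firsts
ORDER BY period_start;
```

Because the grid comes from generate_series rather than from the data, a period with no readings still produces a row (with a NULL diff) instead of silently disappearing. The last period also gets a NULL diff, since there is no following period to compare against.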

Answer 1: (score: 0)

This actually turned out to be less complicated than expected.

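The subquery already computes how many minutes into its 5-minute period each reading falls (minutes_into_period), so I just use the outer WHERE clause to keep the rows where that value is 0.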

From there, I can use the LAG function to subtract the previous value and get the difference:

SELECT
  val - (lag(val) over (order by created_at)) as diff,
  val,
  created_at
FROM (
  SELECT
    EXTRACT(MINUTE FROM created_at)::INTEGER % 5 as minutes_into_period,
    concat(kwh, '.', LPAD(wh::text, 3, '0'))::FLOAT as val,
    "readings"."created_at"
  FROM
    readings
  WHERE
    "readings"."created_at" between '2016-06-18 18:29:53' AND '2016-06-18 19:02:53'
  ) s1
WHERE minutes_into_period = 0

Which gives me:

diff              val         created_at
                  115443.656  2016-06-18 18:30:53.124389
166.409           115610.065  2016-06-18 18:35:53.137708
138.266000000003  115748.331  2016-06-18 18:40:53.151326
160.667999999991  115908.999  2016-06-18 18:45:53.164743
147.300000000003  116056.299  2016-06-18 18:50:53.17797
147.778000000006  116204.077  2016-06-18 18:55:53.191821
136.042999999991  116340.12   2016-06-18 19:00:53.217547
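
The query above relies on exactly one reading landing in minute 0 of every period, and it attaches each difference to the end of the period rather than the start. A possible variant, sketched here under the assumption that the readings table has the kwh, wh and created_at columns from the question, picks the first reading of each period with DISTINCT ON and uses lead() so the difference sits on the period's own start time. The alias firsts is made up for illustration, and numeric replaces FLOAT to avoid tails such as 138.266000000003.

```sql
SELECT
  period_start,
  -- First value of the next period minus first value of this period.
  lead(first_val) OVER (ORDER BY period_start) - first_val AS diff
FROM (
  -- DISTINCT ON keeps one row per period_start; the ORDER BY makes it
  -- the earliest reading in that period.
  SELECT DISTINCT ON (period_start)
    period_start,
    val AS first_val
  FROM (
    SELECT
      -- Round created_at down to the start of its 5-minute period,
      -- using the same expression as in the question.
      date_trunc('minute', created_at)
        - (EXTRACT(MINUTE FROM created_at)::integer % 5) * interval '1 minute'
        AS period_start,
      concat(kwh, '.', lpad(wh::text, 3, '0'))::numeric AS val,
      created_at
    FROM readings
    WHERE created_at BETWEEN '2016-06-18 18:29:53' AND '2016-06-18 19:02:53'
  ) s1
  ORDER BY period_start, created_at
) firsts
ORDER BY period_start;
```

With this filter the 18:25:00 period contains only a single reading, so its difference covers one minute rather than five, and the final period has a NULL diff because no later period exists. Both edge rows correspond to the "..." entries in the grouping the question asks for.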