postgres: using previous row value when current row value is null

Time: 2015-06-15 14:48:49

Tags: postgresql

In my app I need to produce output (later converted to JSON) from a postgres database that holds temperature data gathered by sensors.

The data is stored in device_history_log as follows (for data between 11am and noon):

row_id;deviceid;sensor_value_raw;last_update
104401;20865735;21.56;"2015-06-10 11:00:14"
104432;493417852;23.9;"2015-06-10 11:00:58"
104516;20865735;21.06;"2015-06-10 11:05:14"
104578;493417852;23.7;"2015-06-10 11:06:43"
104583;20865735;21.12;"2015-06-10 11:13:21"
104601;20865735;21.18;"2015-06-10 11:17:25"
104623;493417852;22.2;"2015-06-10 11:18:21"
104642;20865735;21.25;"2015-06-10 11:21:29"
104937;20865735;21.31;"2015-06-10 11:25:33"
105081;20865735;21.37;"2015-06-10 11:33:41"
105154;20865735;21.43;"2015-06-10 11:37:45"
105303;493417852;24;"2015-06-10 11:46:09"
105358;20865735;21.62;"2015-06-10 11:49:58"

I want to get 5 min averages for the data and so far I'm using this query:

SELECT grid.t5||'.000000' as ts, 
            CASE
                WHEN avg(t.sensor_value_raw) ISNULL THEN -1
                ELSE avg(t.sensor_value_raw) 
            END AS sensorvalue
FROM (

      SELECT date_trunc('min', time_series) as t5 
      FROM generate_series('2015-06-10 11:00:00'::timestamp, '2015-06-11 12:00:00', 
                   '5 min') as time_series

   ) grid
LEFT JOIN device_history_log t 
     ON t.last_update >= grid.t5 AND t.last_update <  grid.t5 +  interval '5 min' AND (t.deviceid = 493417852 or t.deviceid = 20865735)
GROUP  BY grid.t5
ORDER  BY grid.t5

The resulting data looks like this:

"2015-06-10 11:00:00.000000";23.8999996185303
"2015-06-10 11:05:00.000000";21.0599994659424
"2015-06-10 11:10:00.000000";21.1200008392334
"2015-06-10 11:15:00.000000";21.1800003051758
"2015-06-10 11:20:00.000000";21.25
"2015-06-10 11:25:00.000000";21.3099994659424
"2015-06-10 11:30:00.000000";21.3700008392334
"2015-06-10 11:35:00.000000";21.4300003051758
"2015-06-10 11:40:00.000000";-1
"2015-06-10 11:45:00.000000";22.8100004196167
"2015-06-10 11:50:00.000000";-1
"2015-06-10 11:55:00.000000";-1
"2015-06-10 12:00:00.000000";22.9250001907349

I need the averages because more than one sensor reports back, so each interval needs a single averaged value.

My question: is it possible for the 11:40:00 row to show the previous value instead of -1? The temperature sensors only report a change in temperature, so 'no data' means 'no change'.

Also, any leading values that are -1 (and so have no previous data) should ideally be removed or ignored. I can do that when parsing the data, but if it can be done at query level, even better.

Thus my ideal result set would be like this:

"2015-06-10 11:00:00.000000";23.8999996185303
"2015-06-10 11:05:00.000000";21.0599994659424
"2015-06-10 11:10:00.000000";21.1200008392334
"2015-06-10 11:15:00.000000";21.1800003051758
"2015-06-10 11:20:00.000000";21.25
"2015-06-10 11:25:00.000000";21.3099994659424
"2015-06-10 11:30:00.000000";21.3700008392334
"2015-06-10 11:35:00.000000";21.4300003051758
"2015-06-10 11:40:00.000000";21.4300003051758
"2015-06-10 11:45:00.000000";22.8100004196167
"2015-06-10 11:50:00.000000";22.8100004196167
"2015-06-10 11:55:00.000000";22.8100004196167
"2015-06-10 12:00:00.000000";22.9250001907349

Thank you!

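1 answer:

Answer 0 (score: 1)

The following query fills the empty values in the result set of your original query. The method splits the data into partitions according to the running count of non-empty values and selects the first (non-empty) value from each partition (add part to the outer select to see how it works).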

WITH survey AS (

    SELECT grid.t5||'.000000' as ts, 
        CASE
            WHEN avg(t.sensor_value_raw) ISNULL THEN -1
            ELSE avg(t.sensor_value_raw) 
        END AS sensorvalue
    FROM (

          SELECT date_trunc('min', time_series) as t5 
          FROM generate_series('2015-06-10 11:00:00'::timestamp, '2015-06-11 12:00:00', 
                       '5 min') as time_series

       ) grid
    LEFT JOIN device_history_log t 
         ON t.last_update >= grid.t5 AND t.last_update <  grid.t5 +  interval '5 min' AND (t.deviceid = 493417852 or t.deviceid = 20865735)
    GROUP  BY grid.t5
    ORDER  BY grid.t5)

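-- part is a running count of real (non -1) readings, so every gap row
-- shares a partition with the last real reading before it.
SELECT
    ts, first_value(sensorvalue) OVER (PARTITION BY part ORDER BY ts) AS sensorvalue
FROM (
    SELECT *, sum((sensorvalue != -1)::int) OVER (ORDER BY ts) AS part
    FROM survey) alias
ORDER BY ts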
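
The question mentions that leading -1 rows could also be dropped while parsing the data, and the query above leaves them in place (they end up in partition 0). As a rough illustration of the same partitioning idea on the application side, here is a minimal Python sketch. The function name fill_gaps and the sample rows are made up for this example, and the rows are assumed to be already sorted by timestamp with -1 as the gap marker, as in the original query.

```python
from itertools import accumulate

SENTINEL = -1  # marker the original query emits for empty 5-minute buckets


def fill_gaps(rows, sentinel=SENTINEL):
    """Forward-fill sentinel values and drop leading rows with no earlier reading.

    rows: list of (ts, value) tuples, assumed sorted by ts.
    """
    # Running count of real readings, the same idea as
    # sum((sensorvalue != -1)::int) OVER (ORDER BY ts) in the query.
    parts = list(accumulate(int(value != sentinel) for _, value in rows))

    first_in_part = {}
    filled = []
    for (ts, value), part in zip(rows, parts):
        if part == 0:
            continue  # no reading seen yet, nothing to carry forward
        # The first row of every partition >= 1 is always a real reading.
        first_in_part.setdefault(part, value)
        filled.append((ts, first_in_part[part]))
    return filled


# Illustrative rows only (not the full data set from the question).
rows = [
    ("2015-06-10 11:30:00", -1),
    ("2015-06-10 11:35:00", 21.43),
    ("2015-06-10 11:40:00", -1),
    ("2015-06-10 11:45:00", 22.81),
    ("2015-06-10 11:50:00", -1),
    ("2015-06-10 11:55:00", -1),
]

for ts, value in fill_gaps(rows):
    print(ts, value)
```

In the query itself, a comparable effect should be achievable by filtering on part > 0 in the outer select, since partition 0 contains only rows that precede the first real reading.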