PostgreSQL time series intervals

Time: 2017-01-25 19:03:13

Tags: sql postgresql time-series

I am using Postgres (RDS) to store time series data.

Suppose my data looks like this:

  • timestamp: the index and partition key
  • source: an integer index
  • data: binary JSON (jsonb) containing the data

timestamp           | source   |  data
---------------------+----------+------------------
 2017-01-24 19:24:41 |  1       | { some jsonb }
 2017-01-24 19:25:41 |  1       | { some jsonb }
 2017-01-24 19:25:41 |  2       | { some jsonb }
 2017-01-24 19:26:41 |  3       | { some jsonb }
 2017-01-24 19:32:41 |  1       | { some jsonb }
 2017-01-24 19:33:41 |  2       | { some jsonb }
 2017-01-24 19:45:41 |  3       | { some jsonb }
 2017-01-24 19:50:41 |  1       | { some jsonb }
 2017-01-24 19:56:41 |  1       | { some jsonb }
 2017-01-24 20:01:41 |  1       | { some jsonb }

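I want to order the data by source and split it by interval, meaning into 15-minute buckets. I would also like each timestamp rounded to its interval time.

So far I have: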
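"Rounding to the interval" can mean either flooring to the start of the 15-minute bucket or rounding to the nearest 15-minute boundary, and the two give different results. The following is a minimal sketch that compares both for one sample timestamp from the table above. The column alias `ts` and the output names `floored` and `nearest` are illustrative assumptions, not part of the original schema.

```sql
-- Compare two ways of snapping a timestamp to a 15-minute grid.
select
    ts,
    -- Floor: truncate to the hour, then add whole 15-minute steps
    -- (integer division discards the remainder).
    date_trunc('hour', ts)
        + (date_part('minute', ts)::int / 15) * interval '15 min' as floored,
    -- Nearest: round the epoch to a multiple of 900 seconds.
    -- to_timestamp returns timestamptz, so convert back at UTC to get
    -- a plain timestamp that does not depend on the session time zone.
    to_timestamp(round(extract(epoch from ts) / 900) * 900)
        at time zone 'UTC' as nearest
from (values (timestamp '2017-01-24 19:24:41')) as v(ts);
-- floored: 2017-01-24 19:15:00
-- nearest: 2017-01-24 19:30:00
```

The expected output in this question (19:24:41 landing in the 19:15:00 bucket) corresponds to the flooring variant.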

SELECT date_trunc('hour', timestamp) + date_part('minute', timestamp)::int / 15 * interval '15 min' AS fifteen_minutes, data
FROM MY_TABLE
where source=1
GROUP BY data, fifteen_minutes
ORDER BY fifteen_minutes desc

which returns:

fifteen_minutes      | source   |  data
---------------------+----------+------------------
 2017-01-24 19:15:00 |  1       | { some jsonb }
 2017-01-24 19:15:00 |  1       | { some jsonb }
 2017-01-24 19:30:00 |  1       | { some jsonb }
 2017-01-24 19:45:00 |  1       | { some jsonb }
 2017-01-24 19:45:00 |  1       | { some jsonb }
 2017-01-24 20:00:00 |  1       | { some jsonb }

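The problem is that I still get multiple results per interval. I want a single distinct row per interval: the one whose timestamp is closest to the interval time.

Ideally, I would like to get (one result per interval):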

fifteen_minutes      | source   |  data
---------------------+----------+------------------
 2017-01-24 19:15:00 |  1       | { some jsonb }
 2017-01-24 19:30:00 |  1       | { some jsonb }
 2017-01-24 19:45:00 |  1       | { some jsonb }
 2017-01-24 20:00:00 |  1       | { some jsonb }

Any good ideas? Thanks!

1 answer:

Answer 0 (score: 1)

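select distinct on (fifteen_minutes, source)
    fifteen_minutes, source, data
from (
    select
        -- the ::int cast rounds, so each row maps to the nearest 15-minute boundary
        to_timestamp((extract(epoch from timestamp) / (15 * 60))::int * 15 * 60) as fifteen_minutes,
        -- source must be selected here so the outer query can reference it
        source, data, timestamp
    from t
) t
order by
    fifteen_minutes, source,
    -- within each (interval, source) group, keep the row closest to the boundary
    abs(extract(epoch from timestamp) - extract(epoch from fifteen_minutes))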
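To try the `distinct on` approach end to end without a real table, here is a self-contained sketch built from the source 1 rows in the question. It is a variant under stated assumptions, not the answer's exact query: it reuses the flooring expression from the question so the buckets match the desired output, the column is named `ts` rather than `timestamp`, and the jsonb payloads are made-up placeholders.

```sql
-- Sample rows for source 1 from the question; payloads are placeholders.
with my_table (ts, source, data) as (
    values
        (timestamp '2017-01-24 19:24:41', 1, '{"n": 1}'::jsonb),
        (timestamp '2017-01-24 19:25:41', 1, '{"n": 2}'::jsonb),
        (timestamp '2017-01-24 19:32:41', 1, '{"n": 3}'::jsonb),
        (timestamp '2017-01-24 19:50:41', 1, '{"n": 4}'::jsonb),
        (timestamp '2017-01-24 19:56:41', 1, '{"n": 5}'::jsonb),
        (timestamp '2017-01-24 20:01:41', 1, '{"n": 6}'::jsonb)
),
bucketed as (
    select
        -- Floor each timestamp to the start of its 15-minute bucket.
        date_trunc('hour', ts)
            + (date_part('minute', ts)::int / 15) * interval '15 min' as fifteen_minutes,
        ts, source, data
    from my_table
)
-- distinct on keeps the first row of each (source, bucket) group,
-- and the order by decides which row comes first.
select distinct on (source, fifteen_minutes)
    fifteen_minutes, source, ts, data
from bucketed
order by
    source, fifteen_minutes,
    ts - fifteen_minutes;  -- smallest offset from the bucket start wins
-- fifteen_minutes      | source | ts
-- 2017-01-24 19:15:00  | 1      | 2017-01-24 19:24:41
-- 2017-01-24 19:30:00  | 1      | 2017-01-24 19:32:41
-- 2017-01-24 19:45:00  | 1      | 2017-01-24 19:50:41
-- 2017-01-24 20:00:00  | 1      | 2017-01-24 20:01:41
```

Because the buckets are floored, every row sits at or after its bucket start, so ordering by `ts - fifteen_minutes` is enough and no `abs` is needed. With nearest-boundary rounding, as in the query above, rows can fall on either side of the boundary and the absolute difference is required.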