How to bin timestamp data into n-minute buckets in Postgres

Time: 2018-04-16 15:11:15

Tags: postgresql

I have the following working query, which sorts timestamped "observations" into buckets whose boundaries are defined by the bins table:

SELECT
  count(id),
  width_bucket(
      time :: TIMESTAMP,
      (SELECT ARRAY(SELECT start_time
                    FROM bins
                    WHERE owner_id = 'some id'
                    ORDER BY start_time ASC) :: TIMESTAMP[])
  ) bucket
FROM observations
WHERE owner_id = 'some id'
GROUP BY bucket
ORDER BY bucket;

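I would like to modify it so that I can query arbitrary n-minute bins starting from a specified timestamp, without having to pull the boundaries from an actual "bins" table.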

That is, given a start time, a "bin width" in minutes, and a number of bins, is there a way to generate the array of timestamps to pass to the width_bucket function?

Alternatively, is there a different or simpler way to get the same result?

2 answers:

Answer 0: (score: 1)

Use the function generate_series(start, stop, step interval) to produce the boundary timestamps and aggregate them into an array, e.g.

select array_agg(t) as result
from generate_series('2018-04-15 00:00'::timestamp, '2018-04-15 01:00', '30 minutes') t;

                               result                                
---------------------------------------------------------------------
 {"2018-04-15 00:00:00","2018-04-15 00:30:00","2018-04-15 01:00:00"}
(1 row)
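
To connect this back to the question, the generated array can take the place of the subquery on the bins table. The following is only a sketch: the start time, the 5-minute width, and the count of 12 bins are made-up values, and it assumes observations.time can be cast to timestamp as in the original query. For 12 bins the series needs 13 boundaries, so the stop value is start + 12 * width.

```sql
-- Hypothetical parameters: start = 2018-04-15 00:00, bin width = 5 minutes, 12 bins.
SELECT
  count(id),
  width_bucket(
      time :: TIMESTAMP,
      (SELECT array_agg(t ORDER BY t)
       FROM generate_series(
              '2018-04-15 00:00' :: TIMESTAMP,                            -- start
              '2018-04-15 00:00' :: TIMESTAMP + 12 * INTERVAL '5 minutes', -- start + n * width
              INTERVAL '5 minutes') t)                                     -- bin width
  ) bucket
FROM observations
WHERE owner_id = 'some id'
GROUP BY bucket
ORDER BY bucket;
```

With these boundaries, rows earlier than the start land in bucket 0 and rows at or after the final boundary land in bucket 13, so a filter on bucket (or on time) is needed if only the 12 in-range bins are wanted.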

Answer 1: (score: 0)

Another approach that does not use a series:

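Divide the time elapsed since the start timestamp by the bin width (5 minutes in the example), take the floor, then add 1, because width_bucket(...) numbers its first bucket 1 rather than 0.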

floor(extract(epoch from (time - '2019-06-04 00:00'::timestamp)) / (5 * 60) ) + 1 as bucket
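
As a quick sanity check of the arithmetic, here is the same expression applied to a single literal value. The sample timestamp 00:12 is an arbitrary choice for illustration: 12 minutes is 720 seconds, 720 / 300 = 2.4, the floor is 2, and adding 1 gives 3.

```sql
-- Worked example with a made-up observation time of 2019-06-04 00:12.
SELECT floor(
         extract(epoch from ('2019-06-04 00:12'::timestamp - '2019-06-04 00:00'::timestamp))
         / (5 * 60)
       ) + 1 AS bucket;
-- bucket = 3, i.e. the bin covering 00:10 up to (but not including) 00:15
```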

You can also compute the start of each bin:

to_timestamp(floor(extract(epoch from time) / (5 * 60)) * (5 * 60)) as bin_start
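
Note that this expression aligns bins to the Unix epoch rather than to the chosen start timestamp, and to_timestamp returns a timestamp with time zone, so the displayed value may be shifted when the session time zone is not UTC. A possible variant, offered only as a sketch and assuming time is of type timestamp, adds whole bin widths back onto the same start timestamp used for the bucket number:

```sql
-- Sketch: bin start anchored at the same start timestamp as the bucket number.
SELECT
  time,
  '2019-06-04 00:00'::timestamp
    + floor(extract(epoch from (time - '2019-06-04 00:00'::timestamp)) / (5 * 60))
      * interval '5 minutes' as bin_start
FROM observations
WHERE owner_id = 'some id';
```

The result stays a plain timestamp, and bucket 1 always starts exactly at the start timestamp, even when that timestamp is not a multiple of the bin width.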

Putting it all together:

SELECT
  count(id),
  floor(extract(epoch from (time - '2019-06-04 00:00'::timestamp)) / (5 * 60) ) + 1 as bucket,
  to_timestamp(floor(extract(epoch from time) / (5 * 60)) * (5 * 60)) as bin_start
FROM observations
WHERE owner_id = 'some id'
GROUP BY bucket, bin_start
ORDER BY bucket;
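
On PostgreSQL 14 or later, the built-in date_bin(stride, source, origin) function can compute the bin start directly. This is an alternative to the approach above rather than part of the original answer, and the sketch assumes time is of type timestamp; the 5-minute stride and the origin are the same illustrative values used above.

```sql
-- Requires PostgreSQL 14+; bins are 5 minutes wide and aligned to the origin timestamp.
SELECT
  count(id),
  date_bin('5 minutes', time, '2019-06-04 00:00'::timestamp) as bin_start
FROM observations
WHERE owner_id = 'some id'
GROUP BY bin_start
ORDER BY bin_start;
```

This returns the bin start instead of a bucket number; if a 1-based number is also needed, the floor(...) + 1 expression can be added alongside it.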