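I am trying to work out a PostgreSQL query that groups the rows of a table into time intervals (for example, 5 seconds), where each interval starts at a minimum timestamp.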
CREATE TABLE foo AS SELECT
timestamp::timestamp with time zone,
name::text
FROM ( VALUES
('2010-11-16 10:32:22', 'John'),
('2010-11-16 10:32:23', 'John'),
('2010-11-16 10:32:25', 'John'),
('2010-11-16 10:32:27', 'John'),
('2010-11-16 10:32:27', 'John'),
('2010-11-16 10:32:29', 'John'),
('2010-11-16 10:37:45', 'John'),
('2010-11-16 10:37:45', 'John'),
('2010-11-16 10:37:46', 'John'),
('2010-11-16 10:38:08', 'John')
) AS t(timestamp, name);
Given this test data:
timestamp name
------------------- ----
2010-11-16 10:32:22 John
2010-11-16 10:32:23 John
2010-11-16 10:32:25 John
2010-11-16 10:32:27 John
2010-11-16 10:32:27 John
2010-11-16 10:32:29 John
2010-11-16 10:37:45 John
2010-11-16 10:37:45 John
2010-11-16 10:37:46 John
2010-11-16 10:38:08 John
The desired result should look like this:
timestamp name
------------------- ----
2010-11-16 10:32:22 John
2010-11-16 10:32:27 John
2010-11-16 10:37:45 John
2010-11-16 10:38:08 John
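Note: the intervals are based on the first occurrence of a timestamp, not on the generic time intervals discussed here
Answer 0 (score: 2)
This is what you want.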
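First we compute a groupid.
timestamp-min(timestamp) OVER (): a window function call that yields an interval, the duration between the current row's timestamp and the minimum timestamp.
extract(epoch from INTERVAL): then we extract the number of seconds in that interval.
floor( SECONDS /5): divide by 5 seconds and round down.
Here is the query: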
SELECT floor(extract(epoch from (timestamp-min(timestamp) OVER ()))/5) AS groupid
, *
FROM foo
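To check the arithmetic outside the database, here is a minimal Python sketch of the three steps for a single row. It is only an illustration, not part of the query: the variable names are made up, and it uses naive datetimes, so time zones are ignored.

```python
from datetime import datetime
from math import floor

# Smallest timestamp in the sample data, and one later row from it.
min_ts = datetime(2010, 11, 16, 10, 32, 22)
ts = datetime(2010, 11, 16, 10, 37, 45)

interval = ts - min_ts               # step 1: timestamp - min(timestamp) OVER ()
seconds = interval.total_seconds()   # step 2: extract(epoch from ...)
groupid = floor(seconds / 5)         # step 3: floor(seconds / 5)

print(seconds, groupid)  # 323.0 64
```

So the 10:37:45 rows get groupid 64, while the first row of the table gets groupid 0.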
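Then we wrap that in a CTE and pick one distinct row per group, ordered by timestamp so that the earliest row of each group is the one kept.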
WITH t AS (
SELECT
floor(extract(epoch from timestamp-min(timestamp) OVER ()) /5) AS groupid, *
FROM foo
)
SELECT DISTINCT ON (groupid) timestamp, name
FROM t
ORDER BY groupid, timestamp;
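If it helps to see what DISTINCT ON (groupid) with ORDER BY groupid, timestamp ends up keeping, the following Python sketch mimics it on the sample data. The function name first_row_per_bucket and the width parameter are hypothetical, and this is an approximation of the query's behaviour, not what PostgreSQL does internally.

```python
from datetime import datetime

def first_row_per_bucket(rows, width=5):
    """Keep the earliest row of each fixed-width bucket counted from the overall minimum."""
    min_ts = min(ts for ts, _ in rows)              # min(timestamp) OVER ()
    first = {}
    for ts, name in sorted(rows):                   # ORDER BY timestamp
        groupid = int((ts - min_ts).total_seconds() // width)
        first.setdefault(groupid, (ts, name))       # DISTINCT ON (groupid): first row wins
    return [first[g] for g in sorted(first)]        # ORDER BY groupid

times = ["10:32:22", "10:32:23", "10:32:25", "10:32:27", "10:32:27",
         "10:32:29", "10:37:45", "10:37:45", "10:37:46", "10:38:08"]
rows = [(datetime.strptime("2010-11-16 " + t, "%Y-%m-%d %H:%M:%S"), "John")
        for t in times]

for ts, name in first_row_per_bucket(rows):
    print(ts, name)
# 2010-11-16 10:32:22 John
# 2010-11-16 10:32:27 John
# 2010-11-16 10:37:45 John
# 2010-11-16 10:38:08 John
```

On this data the sketch returns the same four rows as the desired result in the question.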
Note that we do not use GROUP BY anywhere. That is because you want whole rows returned, so it is not needed.
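As @ypercube(TM) pointed out, this has a consequence: every window is counted from the single overall minimum timestamp. So if you change 10:37:45 to 10:37:41, you will find that 10:37:41 lands in a different group from 10:37:45, even though the two are only 4 seconds apart.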
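A small Python sketch makes the difference visible. fixed_groupids follows the rule used by the query (buckets counted from the overall minimum). anchored_starts is a hypothetical alternative, one possible reading of the note in the question, where each window opens at the first timestamp not covered by the previous window. Both names are made up for illustration, and the second function is not a translation of any query in this thread.

```python
from datetime import datetime, timedelta

def fixed_groupids(stamps, width=5):
    """Bucket number of each timestamp, counted from the overall minimum."""
    min_ts = min(stamps)
    return [int((ts - min_ts).total_seconds() // width) for ts in stamps]

def anchored_starts(stamps, width=5):
    """Hypothetical alternative: a new window opens at the first uncovered timestamp."""
    starts = []
    for ts in sorted(stamps):
        # Open a new window when ts is at least `width` seconds past the last start.
        if not starts or ts - starts[-1] >= timedelta(seconds=width):
            starts.append(ts)
    return starts

stamps = [datetime(2010, 11, 16, 10, 32, 22),
          datetime(2010, 11, 16, 10, 37, 41),
          datetime(2010, 11, 16, 10, 37, 45)]

print(fixed_groupids(stamps))  # [0, 63, 64]
print([s.strftime("%H:%M:%S") for s in anchored_starts(stamps)])  # ['10:32:22', '10:37:41']
```

With fixed buckets, 10:37:41 and 10:37:45 get groupids 63 and 64. With windows anchored at the first uncovered timestamp, 10:37:45 falls inside the window that opens at 10:37:41.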
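Answer 1 (score: -1)
The idea is to take the difference between each timestamp and the minimum. You can compute the minimum with a window function. The difference is of type interval, from which you can extract the number of seconds (using epoch). Divide that by 5, round down, and multiply by 5 again to get the offset at which the bucket starts.
Finally, add the minimum timestamp back to get what you want.
I am not sure where the name is supposed to come from, but this is the idea: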
select (min_timestamp +
floor(extract(epoch from (timestamp - min_timestamp)) / 5)*5 * interval '1 second'
) as timestamp,
min(name)
from (select t.*, min(timestamp) over () as min_timestamp
from t
) t
group by (min_timestamp +
floor(extract(epoch from (timestamp - min_timestamp)) / 5)*5 * interval '1 second'
);
Here is the sample code:
with t(timestamp, name) as (
SELECT
timestamp::timestamp with time zone,
name::text
FROM ( VALUES
('2010-11-16 10:32:22', 'John'),
('2010-11-16 10:32:23', 'John'),
('2010-11-16 10:32:25', 'John'),
('2010-11-16 10:32:27', 'John'),
('2010-11-16 10:32:27', 'John'),
('2010-11-16 10:32:29', 'John'),
('2010-11-16 10:37:45', 'John'),
('2010-11-16 10:37:45', 'John'),
('2010-11-16 10:37:46', 'John'),
('2010-11-16 10:38:08', 'John')
) foo(timestamp, name)
)
select (min_timestamp +
floor(extract(epoch from (timestamp - min_timestamp)) / 5) *5* interval '1 second'
) as timestamp,
min(name)
from (select t.*, min(timestamp) over () as min_timestamp
from t
) t
group by (min_timestamp +
floor(extract(epoch from (timestamp - min_timestamp)) / 5)*5 * interval '1 second'
);
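One thing worth checking is which timestamp this query reports for each group. The Python sketch below mirrors the expression min_timestamp + floor(seconds / 5) * 5 * interval '1 second' for a few rows of the sample data. The helper name bucket_start is hypothetical, and naive datetimes are used, so time zones are ignored.

```python
from datetime import datetime, timedelta

def bucket_start(ts, min_ts, width=5):
    """Mirror of: min_timestamp + floor(seconds / width) * width * interval '1 second'."""
    seconds = (ts - min_ts).total_seconds()
    return min_ts + timedelta(seconds=(seconds // width) * width)

min_ts = datetime(2010, 11, 16, 10, 32, 22)
for t in ("10:32:29", "10:37:45", "10:38:08"):
    ts = datetime.strptime("2010-11-16 " + t, "%Y-%m-%d %H:%M:%S")
    print(t, "->", bucket_start(ts, min_ts).time())
# 10:32:29 -> 10:32:27
# 10:37:45 -> 10:37:42
# 10:38:08 -> 10:38:07
```

The reported value is the computed start of the bucket, not the first timestamp that actually occurs in it, so for the last two groups it gives 10:37:42 and 10:38:07 where the desired result lists 10:37:45 and 10:38:08.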