Group by a floating date range

Time: 2013-09-13 09:04:30

Tags: sql postgresql grouping window-functions group-by

I am using PostgreSQL 9.2. I have a table that records the times at which devices went out of service.

+----------+----------+---------------------+
| event_id |  device  |         time        |
+----------+----------+---------------------+
|        1 | Switch4  | 2013-09-01 00:01:00 |
|        2 | Switch1  | 2013-09-01 00:02:30 |
|        3 | Switch10 | 2013-09-01 00:02:40 |
|        4 | Switch51 | 2013-09-01 03:05:00 |
|        5 | Switch49 | 2013-09-02 13:00:00 |
|        6 | Switch28 | 2013-09-02 13:01:00 |
|        7 | Switch9  | 2013-09-02 13:02:00 |
+----------+----------+---------------------+

I would like to group rows whose times are within +/- 3 minutes of each other, like this:

+----------+----------+---------------------+--------+
| event_id |  device  |         time        |  group |
+----------+----------+---------------------+--------+
|        1 | Switch4  | 2013-09-01 00:01:00 |      1 |
|        2 | Switch1  | 2013-09-01 00:02:30 |      1 |
|        3 | Switch10 | 2013-09-01 00:02:40 |      1 |
|        4 | Switch51 | 2013-09-01 03:05:00 |      2 |
|        5 | Switch49 | 2013-09-02 13:00:00 |      3 |
|        6 | Switch28 | 2013-09-02 13:01:00 |      3 |
|        7 | Switch9  | 2013-09-02 13:02:00 |      3 |
+----------+----------+---------------------+--------+

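I tried to use window functions, but the documentation for the frame clause says:

    [ RANGE | ROWS ] BETWEEN frame_start AND frame_end

where frame_start and frame_end can be one of

    UNBOUNDED PRECEDING
    value PRECEDING
    CURRENT ROW
    value FOLLOWING
    UNBOUNDED FOLLOWING

and value must be an integer expression not containing any variables, aggregate functions, or window functions.

Given that restriction, I cannot specify a time interval as the frame. Now I doubt that window functions can solve my problem. Could you help me?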

5 answers:

Answer 0 (score: 5)

SQL Fiddle

select
    event_id, device, ts,
    floor(extract(epoch from ts) / 180) as group
from t
order by ts

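With a window function it is possible to make the group numbers start at 1, but that has a non-negligible cost, and I don't know whether it is necessary. Here it is: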

select
    event_id, device, ts,
    dense_rank() over(order by "group") as group
from (
    select
        event_id, device, ts,
        floor(extract(epoch from ts) / 180) as group
    from t
) s
order by ts

time is a reserved word. Pick a different column name.
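
To make the bucket arithmetic concrete, here is a minimal sketch of the same idea in Python rather than SQL, so it can be run without a database. The helper names bucket and dense_rank are made up for this illustration, and naive timestamps are assumed to be UTC. It also shows a limitation of fixed 180-second buckets: two events a couple of seconds apart can land in different groups when they straddle a bucket boundary.

```python
from datetime import datetime, timezone

BUCKET_SECONDS = 180  # 3 minutes, as in floor(extract(epoch from ts) / 180)


def bucket(ts: datetime) -> int:
    """Fixed-width bucket number of a timestamp (naive values treated as UTC)."""
    epoch = ts.replace(tzinfo=timezone.utc).timestamp()
    return int(epoch // BUCKET_SECONDS)


def dense_rank(keys):
    """Map each key to its 1-based rank among the distinct keys."""
    order = {k: i for i, k in enumerate(sorted(set(keys)), start=1)}
    return [order[k] for k in keys]


# The seven timestamps from the question.
sample = [
    datetime(2013, 9, 1, 0, 1, 0),
    datetime(2013, 9, 1, 0, 2, 30),
    datetime(2013, 9, 1, 0, 2, 40),
    datetime(2013, 9, 1, 3, 5, 0),
    datetime(2013, 9, 2, 13, 0, 0),
    datetime(2013, 9, 2, 13, 1, 0),
    datetime(2013, 9, 2, 13, 2, 0),
]
groups = dense_rank([bucket(ts) for ts in sample])
print(groups)  # [1, 1, 1, 2, 3, 3, 3]

# Two events only 2 seconds apart, on either side of a bucket boundary.
close = [datetime(2013, 9, 1, 0, 2, 59), datetime(2013, 9, 1, 0, 3, 1)]
print(dense_rank([bucket(ts) for ts in close]))  # [1, 2]
```

If events that close together must always share a group, a gap-based approach is a better fit than fixed buckets.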

Answer 1 (score: 1)

SQLFiddle

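with u as (
    select
        *,
        -- true on the first row and whenever the gap to the previous row
        -- exceeds 3 minutes (epoch is in seconds, so / 60 gives minutes)
        extract(epoch from ts - lag(ts) over (order by ts)) / 60 > 3
            or lag(ts) over (order by ts) is null as test
    from t
)
-- the running count of group starts is the group number
select *, sum(test::int) over (order by ts) as grp from u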
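
The lag-plus-running-sum pattern can be hard to read in SQL, so here is a small sketch of the same logic in Python (not SQL, just to show the mechanics). The function name gap_groups and the 3-minute default are assumptions taken from the question, not part of the answer above.

```python
from datetime import datetime, timedelta


def gap_groups(timestamps, max_gap=timedelta(minutes=3)):
    """Return group numbers (1, 2, 3, ...) for the timestamps in sorted order.

    A new group starts at the first row and whenever the gap to the
    previous row exceeds max_gap.
    """
    groups = []
    current = 0
    previous = None
    for ts in sorted(timestamps):
        # Equivalent of the boolean "test" column.
        starts_new = previous is None or ts - previous > max_gap
        # Equivalent of the running sum over the rows ordered by ts.
        current += int(starts_new)
        groups.append(current)
        previous = ts
    return groups


sample = [
    datetime(2013, 9, 1, 0, 1, 0),
    datetime(2013, 9, 1, 0, 2, 30),
    datetime(2013, 9, 1, 0, 2, 40),
    datetime(2013, 9, 1, 3, 5, 0),
    datetime(2013, 9, 2, 13, 0, 0),
    datetime(2013, 9, 2, 13, 1, 0),
    datetime(2013, 9, 2, 13, 2, 0),
]
print(gap_groups(sample))  # [1, 1, 1, 2, 3, 3, 3]

# Events 2 minutes apart chain into one group, however long the chain gets.
chain = [datetime(2013, 9, 1, 0, m, 0) for m in (0, 2, 4, 6)]
print(gap_groups(chain))  # [1, 1, 1, 1]
```

Note the chaining behaviour: a group can span much more than 3 minutes in total, as long as no two neighbouring events are more than 3 minutes apart. Whether that is what you want depends on how you read "+/- 3 minutes".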

Answer 2 (score: 1)

This is a slight improvement on @Clodoaldo's basically good answer.

To get consecutive group numbers:

SELECT event_id, device, ts
      ,dense_rank() OVER (ORDER BY trunc(extract(epoch from ts) / 180)) AS grp
FROM   tbl
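ORDER  BY ts

  • Using ts instead of the (partially) reserved word time is good advice. For the same reason, don't use the reserved word group either. Use grp instead.

  • The consecutive numbers can be computed without a subquery.

  • Use trunc() instead of floor(). Both work fine; trunc() is slightly faster.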
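
One caveat worth knowing when swapping floor() for trunc(): the two agree for non-negative values but differ for negative ones, because trunc rounds toward zero while floor rounds toward negative infinity. For epoch values that only matters for timestamps before 1970. The sketch below uses Python's math.floor and math.trunc, which have the same rounding semantics; the helper names are made up for the illustration.

```python
import math

BUCKET_SECONDS = 180


def bucket_floor(epoch: float) -> int:
    """Bucket number rounding toward negative infinity."""
    return math.floor(epoch / BUCKET_SECONDS)


def bucket_trunc(epoch: float) -> int:
    """Bucket number rounding toward zero."""
    return math.trunc(epoch / BUCKET_SECONDS)


after_1970 = 1377993750  # 2013-09-01 00:02:30 UTC
print(bucket_floor(after_1970), bucket_trunc(after_1970))  # 7655520 7655520

# Just before and just after 1970-01-01 00:00:00 UTC.
print(bucket_floor(-179), bucket_floor(179))  # -1 0
print(bucket_trunc(-179), bucket_trunc(179))  # 0 0
```

With trunc, the bucket around epoch 0 is effectively twice as wide as the others. For data like the 2013 timestamps in the question this never comes up.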

Answer 3 (score: 0)

http://www.depesz.com/2010/09/12/how-to-group-messages-into-chats/

Window functions are the right tool here. This is a textbook example:

with
  xinterval( val ) as ( select 2 ),
  data( id, t ) as 
  (
    values  

      ( 1000, 1 ),
      ( 1001, 2 ),
      ( 1002, 3 ),

      ( 1000, 7 ),
      ( 1003, 8 )

  ),  
  x( id, t, tx ) as
  (
    select id, t,
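      -- tx is set only on rows that start a new period: the first row
      -- (no previous row) or a row whose gap exceeds the interval
      case
        when lag(t) over (order by t) is null
          or t - lag(t) over (order by t) > xinterval.val
        then t
      end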
    from data natural join xinterval
  ),
  xx( id, t, t2 ) as
  (
    select id, t, max(tx) over (order by t) from x
  )
select id, t, text( min(t) over w ) || '-' || text( max(t) over w ) as xperiod
from xx
window w as ( partition by t2 )
order by t
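
The query works in three steps: mark the rows that start a new period, carry that start value forward to the following rows, then label every row with the min and max of its period. Here is a rough Python sketch of those steps using the same data; label_periods is a made-up name, and the carry-forward is written as a plain loop instead of a running max.

```python
def label_periods(rows, max_gap):
    """rows: (id, t) pairs. Returns (id, t, 'start-end') tuples ordered by t."""
    rows = sorted(rows, key=lambda r: r[1])

    # Steps 1 and 2: the group key is the t of the most recent row that
    # started a period (first row, or gap to the previous row > max_gap).
    keyed = []
    group_start = None
    previous_t = None
    for row_id, t in rows:
        if previous_t is None or t - previous_t > max_gap:
            group_start = t
        keyed.append((row_id, t, group_start))
        previous_t = t

    # Step 3: min and max of t within each group.
    bounds = {}
    for _, t, key in keyed:
        lo, hi = bounds.get(key, (t, t))
        bounds[key] = (min(lo, t), max(hi, t))

    return [
        (row_id, t, f"{bounds[key][0]}-{bounds[key][1]}")
        for row_id, t, key in keyed
    ]


data = [(1000, 1), (1001, 2), (1002, 3), (1000, 7), (1003, 8)]
result = label_periods(data, max_gap=2)
for row in result:
    print(row)
# (1000, 1, '1-3')
# (1001, 2, '1-3')
# (1002, 3, '1-3')
# (1000, 7, '7-8')
# (1003, 8, '7-8')
```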

Answer 4 (score: 0)

Create a function:

CREATE OR REPLACE FUNCTION public.date_round (
  base_date timestamp,
  round_interval interval
)
RETURNS TIMESTAMP WITHOUT TIME ZONE AS
$body$
DECLARE
   res TIMESTAMP;
BEGIN   
    res := TIMESTAMP 'epoch' + (EXTRACT(epoch FROM $1)::INTEGER + EXTRACT(epoch FROM $2)::INTEGER / 2)
                / EXTRACT(epoch FROM $2)::INTEGER * EXTRACT(epoch FROM $2)::INTEGER * INTERVAL '1 second';            
    IF (base_date > res ) THEN
        res := res + $2;
    END IF;
    RETURN res;
END;
$body$
LANGUAGE 'plpgsql'
STABLE
CALLED ON NULL INPUT
SECURITY INVOKER
COST 100;

Then group by the result of this function:

SELECT t.grp, count(*) AS events
FROM (SELECT p.oper_date, date_round(p.oper_date, '5 minutes') AS grp FROM test p) t
GROUP BY t.grp
ORDER BY t.grp

It's that simple :)
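
To see what date_round actually returns, here is a Python transcription of its arithmetic. This is my reading of the function, not something the answer states: it rounds to the nearest multiple of the interval and then bumps the result up whenever the input is later, so the net effect is rounding up to the next interval boundary. The sketch assumes timestamps after 1970, since Python's // floors while PostgreSQL integer division truncates.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)


def date_round(base_date: datetime, round_interval: timedelta) -> datetime:
    """Python transcription of the PL/pgSQL date_round above (post-1970 input)."""
    seconds = round((base_date - EPOCH).total_seconds())  # EXTRACT(epoch ...)::INTEGER
    step = int(round_interval.total_seconds())
    # Round to the nearest multiple of step using integer division.
    res = EPOCH + timedelta(seconds=(seconds + step // 2) // step * step)
    # If that rounded down, move up to the next boundary.
    if base_date > res:
        res += round_interval
    return res


five_min = timedelta(minutes=5)
print(date_round(datetime(2013, 9, 1, 0, 1, 0), five_min))  # 2013-09-01 00:05:00
print(date_round(datetime(2013, 9, 1, 0, 4, 0), five_min))  # 2013-09-01 00:05:00
print(date_round(datetime(2013, 9, 1, 0, 5, 0), five_min))  # 2013-09-01 00:05:00
print(date_round(datetime(2013, 9, 1, 0, 5, 1), five_min))  # 2013-09-01 00:10:00
```

Like any fixed-bucket scheme, this splits events that sit on either side of a boundary: 00:05:00 and 00:05:01 above end up in different groups although they are one second apart.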