How do I get the maximum number of concurrent events in PostgreSQL?

Time: 2017-10-24 10:23:42

Tags: sql postgresql timespan

I have a table named events that looks like this:

id: int
source_id: int
start_datetime: timestamp
end_datetime: timestamp  

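These events can overlap, and I want to know the maximum number of events that overlap at the same moment within a given time span. For example, in this case: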

id | source_id | start_datetime     | end_datetime
----------------------------------------------------------
1  | 23        | 2017-1-1T10:20:00  | 2017-1-1T10:40:00
1  | 42        | 2017-1-1T10:30:00  | 2017-1-1T10:35:00
1  | 11        | 2017-1-1T10:37:00  | 2017-1-1T10:50:00  

The answer is 2, because at most 2 events overlap at once, between 10:30 and 10:35.

I am using Postgres 9.6.

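2 Answers:

Answer 0 (score: 1)

I'm not entirely sure how the id and source_id columns should be treated, but based on your description, something like this might work: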

select e1.source_id, 
       count(distinct e2.source_id) as overlap_count, 
       array_agg(e2.source_id) as overlap_events
from events e1
  join events e2 
    on e1.source_id <> e2.source_id
   and (e1.start_datetime, e1.end_datetime) overlaps (e2.start_datetime, e2.end_datetime) 
group by e1.source_id
order by overlap_count desc;

With your sample data, this returns the following:

source_id | overlap_count | overlap_events
----------+---------------+---------------
       23 |             2 | {42,11}       
       11 |             1 | {23}          
       42 |             1 | {23}          

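To get only the row with the highest count, add limit 1 to the query.

Another (possibly slower) option, if you need the complete rows from the events table: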

select e1.id, e1.source_id, e1.start_datetime, e1.end_datetime, 
       (select count(*)
        from events e2
        where e2.source_id <> e1.source_id
          and (e1.start_datetime, e1.end_datetime) overlaps (e2.start_datetime, e2.end_datetime)
       )  as overlap_count
from events e1
order by overlap_count desc;

Another option is to use range types and the && operator instead of overlaps:

select e1.source_id, 
       count(distinct e2.source_id) as overlap_count, 
       array_agg(e2.source_id) as overlap_events
from events e1
  join events e2 on e1.source_id <> e2.source_id
             and tsrange(e1.start_datetime, e1.end_datetime,'[]') && tsrange(e2.start_datetime, e2.end_datetime, '[]') 
group by e1.source_id
order by overlap_count desc;
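
One difference between the two variants is worth spelling out. PostgreSQL's overlaps treats each period as half-open (start <= time < end), while tsrange(..., '[]') builds a closed range, so two events that merely touch at an endpoint count as overlapping only in the && version. The following is a minimal Python sketch of the two rules, not code from the answer; the function names are hypothetical and it assumes start < end for every event.

```python
from datetime import datetime


def overlaps_half_open(a_start, a_end, b_start, b_end):
    """Half-open rule (start <= t < end), as overlaps uses when start < end."""
    return a_start < b_end and b_start < a_end


def overlaps_closed(a_start, a_end, b_start, b_end):
    """Closed rule, as tsrange(..., '[]') && tsrange(..., '[]') uses."""
    return a_start <= b_end and b_start <= a_end


# Two events that only touch at 10:30.
a = (datetime(2017, 1, 1, 10, 0), datetime(2017, 1, 1, 10, 30))
b = (datetime(2017, 1, 1, 10, 30), datetime(2017, 1, 1, 11, 0))

print(overlaps_half_open(*a, *b))  # False
print(overlaps_closed(*a, *b))     # True
```

If back-to-back events should not count as concurrent, the default '[)' bounds on tsrange would match the overlaps behavior more closely.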

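Answer 1 (score: 1)

Here is one idea: count the number of starts and subtract the number of stops. The running total gives the net number of active events at each point in time. The rest is just aggregation: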

with e as (
      select start_datetime as dte, 1 as inc
      from events
      union all
      select end_datetime as dte, -1 as inc
      from events
     )
select max(concurrent)
from (select dte, sum(sum(inc)) over (order by dte) as concurrent
      from e
      group by dte
     ) e;

The subquery shows the number of overlapping events at each point in time.

You can also get the time range of the peak:
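
To make the logic easier to follow outside the database, here is a rough Python sketch of the same start/stop counting. It is an illustration under assumptions, not the answer's code: max_concurrent is a hypothetical helper, and events are assumed to arrive as (start, end) pairs.

```python
from datetime import datetime


def max_concurrent(events):
    """Return the peak number of simultaneously active events.

    events: iterable of (start, end) pairs (assumed input shape).
    """
    # Net change per timestamp: +1 for each start, -1 for each end.
    # This mirrors "group by dte" with sum(inc) in the SQL.
    deltas = {}
    for start, end in events:
        deltas[start] = deltas.get(start, 0) + 1
        deltas[end] = deltas.get(end, 0) - 1

    # Running total in time order mirrors sum(...) over (order by dte).
    running = peak = 0
    for ts in sorted(deltas):
        running += deltas[ts]
        peak = max(peak, running)
    return peak


sample = [
    (datetime(2017, 1, 1, 10, 20), datetime(2017, 1, 1, 10, 40)),
    (datetime(2017, 1, 1, 10, 30), datetime(2017, 1, 1, 10, 35)),
    (datetime(2017, 1, 1, 10, 37), datetime(2017, 1, 1, 10, 50)),
]
print(max_concurrent(sample))  # 2
```

Because the changes are netted per timestamp before the running total is taken, an event that ends at the exact moment another starts does not raise the peak.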

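-- prepend the same "with e as (...)" CTE used in the previous query
select dte, next_dte, concurrent
from (select dte, sum(sum(inc)) over (order by dte) as concurrent,
             -- the next timestamp in time order closes the range
             lead(dte) over (order by dte) as next_dte
      from e
      group by dte
     ) e
order by concurrent desc
fetch first 1 row only;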
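
The peak-range idea can be sketched in Python as well. This is an illustrative sketch with a hypothetical helper name (peak_window) and an assumed input of (start, end) pairs. It pairs each timestamp with the next one, as lead() ordered by time would. When several ranges tie for the peak, this sketch keeps the earliest one, whereas the SQL leaves the choice among tied rows unspecified.

```python
from datetime import datetime


def peak_window(events):
    """Return (start, next_timestamp, concurrent) for the busiest range,
    or None when there are no events."""
    # Net change per timestamp: +1 per start, -1 per end.
    deltas = {}
    for start, end in events:
        deltas[start] = deltas.get(start, 0) + 1
        deltas[end] = deltas.get(end, 0) - 1

    times = sorted(deltas)
    running = 0
    best = None
    for i, ts in enumerate(times):
        running += deltas[ts]
        # The following timestamp closes the range; the last one has none.
        nxt = times[i + 1] if i + 1 < len(times) else None
        if best is None or running > best[2]:
            best = (ts, nxt, running)
    return best


sample = [
    (datetime(2017, 1, 1, 10, 20), datetime(2017, 1, 1, 10, 40)),
    (datetime(2017, 1, 1, 10, 30), datetime(2017, 1, 1, 10, 35)),
    (datetime(2017, 1, 1, 10, 37), datetime(2017, 1, 1, 10, 50)),
]
start, end, count = peak_window(sample)
print(start.time(), end.time(), count)  # 10:30:00 10:35:00 2
```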