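Formatting a table using partitions

Date: 2017-04-07 13:58:40

Tags: sql postgresql

What is the best way in PostgreSQL to get only the rows that fall between the opened and closed events?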

+------------+----+------------+---------------------+
|  event_id  | ID | occurrence |      datetime       |
+------------+----+------------+---------------------+
| 1003603017 | A  | owner_from | 12/16/2016 4:44:16  |
| 1003603017 | A  | owner_to   | 12/16/2016 4:44:38  |
| 1003603017 | A  | owner_from | 12/16/2016 4:44:38  |
| 1003603017 | A  | opened     | 12/16/2016 4:44:39  |
| 1003603017 | B  | owner_from | 12/16/2016 7:36:23  |
| 1003603017 | A  | owner_to   | 12/16/2016 7:36:23  |
| 1003603017 | B  | owner_to   | 12/16/2016 9:00:01  |
| 1003603017 | C  | owner_from | 12/16/2016 9:00:01  |
| 1003603017 | A  | closed     | 12/16/2016 12:00:36 |
| 1003603017 | D  | owner_from | 12/17/2016 4:25:00  |
| 1003603017 | C  | owner_to   | 12/17/2016 4:25:00  |
| 1003603017 | D  | owner_from | 12/17/2016 4:52:02  |
| 1003603017 | D  | owner_to   | 12/17/2016 4:52:02  |
| 1003603017 | D  | opened     | 12/17/2016 4:52:02  |
| 1003603017 | D  | owner_to   | 12/17/2016 8:57:00  |
| 1003603017 | E  | owner_from | 12/17/2016 8:57:00  |
| 1003603017 | D  | closed     | 12/17/2016 12:03:10 |
+------------+----+------------+---------------------+

5 answers:

Answer 0 (score: 0)

This would be easy with the ignore nulls option on lag. Here is another method, using the cumulative maximum of datetime for opened and for closed:

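select t.*
from (select t.*,
             max(case when occurrence = 'opened' then datetime end) over (order by datetime) as mr_opened,
             max(case when occurrence = 'closed' then datetime end) over (order by datetime) as mr_closed
      from t
     ) t
-- mr_closed is null until the first 'closed' row appears, so handle that case explicitly
where mr_opened > mr_closed
   or (mr_opened is not null and mr_closed is null);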

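Notes:

  • This does not include the final closed row. The question is not clear about this requirement.
  • You may want to partition by event_id. The question is not clear about this requirement.
  • In more recent versions of Postgres, you can use the filter syntax instead of case.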
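
As a rough sketch of the last two notes combined, the same running-maximum idea could be written with filter (available from PostgreSQL 9.4) and partitioned by event_id. The CTE below is only a stand-in for the real table t, populated with a few rows from the question rewritten as ISO timestamps. The alias s and the extra is null test are additions for this sketch, so that rows in the first open interval, before any closed row has been seen, are kept.

```sql
-- Sketch only: the CTE stands in for the real table t.
with t (event_id, id, occurrence, datetime) as (
    values
        (1003603017, 'A', 'owner_from', timestamp '2016-12-16 04:44:38'),
        (1003603017, 'A', 'opened',     timestamp '2016-12-16 04:44:39'),
        (1003603017, 'B', 'owner_from', timestamp '2016-12-16 07:36:23'),
        (1003603017, 'A', 'closed',     timestamp '2016-12-16 12:00:36'),
        (1003603017, 'D', 'owner_from', timestamp '2016-12-17 04:25:00')
)
select s.event_id, s.id, s.occurrence, s.datetime
from (select t.*,
             -- most recent 'opened' timestamp seen so far within the event
             max(datetime) filter (where occurrence = 'opened')
                 over (partition by event_id order by datetime) as mr_opened,
             -- most recent 'closed' timestamp seen so far within the event
             max(datetime) filter (where occurrence = 'closed')
                 over (partition by event_id order by datetime) as mr_closed
      from t
     ) s
-- keep rows whose latest 'opened' is newer than their latest 'closed'
where s.mr_opened is not null
  and (s.mr_closed is null or s.mr_opened > s.mr_closed)
order by s.datetime;
```

With this sample data the opened row and the B owner_from row are kept, while the closed row and everything after it are dropped, which matches the first note above.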

Answer 1 (score: 0)

You can use the following query:

SELECT event_id, ID, occurrence, datetime       
FROM (
   SELECT event_id, ID, occurrence, datetime,       
          COUNT(CASE WHEN occurrence = 'opened' THEN 1 END) 
             OVER (PARTITION BY datetime) AS grp,
          COUNT(CASE WHEN occurrence = 'opened' THEN 1 END) 
             OVER (ORDER BY datetime)  - 
          COUNT(CASE WHEN occurrence = 'closed' THEN 1 END) 
             OVER (ORDER BY datetime) AS slice
   FROM mytable) AS t
WHERE t.slice = 1 AND grp <> 1 ;  

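The first windowed COUNT (grp) identifies records whose timestamp coincides with that of an opened record. Using the value of this field, we can filter those records out.

The second computed field (slice) uses the difference between the running totals of opened and closed records to identify the slices that start with an opened record and end just before a closed record.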
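
To see how the two computed fields behave, one option is to select them with no filter applied. The sketch below does that for five rows taken from the question, rewritten as ISO timestamps. The CTE is only a stand-in for mytable, and the final order by is added purely to make the output easier to read.

```sql
-- Sketch only: show grp and slice for every row, with no filtering.
with mytable (event_id, id, occurrence, datetime) as (
    values
        (1003603017, 'C', 'owner_to', timestamp '2016-12-17 04:25:00'),
        (1003603017, 'D', 'owner_to', timestamp '2016-12-17 04:52:02'),
        (1003603017, 'D', 'opened',   timestamp '2016-12-17 04:52:02'),
        (1003603017, 'D', 'owner_to', timestamp '2016-12-17 08:57:00'),
        (1003603017, 'D', 'closed',   timestamp '2016-12-17 12:03:10')
)
select id, occurrence, datetime,
       -- number of 'opened' rows sharing this exact timestamp
       count(case when occurrence = 'opened' then 1 end)
           over (partition by datetime) as grp,
       -- running count of 'opened' minus running count of 'closed'
       count(case when occurrence = 'opened' then 1 end)
           over (order by datetime)
     - count(case when occurrence = 'closed' then 1 end)
           over (order by datetime) as slice
from mytable
order by datetime, occurrence;
```

Because the default window frame includes rows with the same timestamp, both 04:52:02 rows get grp = 1 and slice = 1, so the grp <> 1 condition removes them. Only the 08:57:00 row has slice = 1 with grp = 0, and the closed row falls back to slice = 0.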


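Answer 2 (score: 0)

I went with an inner join myself: select only the opened and closed events, use lead to pick up the following closed timestamp, and join back to the original table where datetime falls between those two values.

Answer 3 (score: 0)

This can be done with simple JOINs (though it needs several of them):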

select e.*
from   events o
join   events c on c.datetime > o.datetime
join   events e on e.datetime > o.datetime and e.datetime < c.datetime
where  o.occurrence = 'opened'
and    c.occurrence = 'closed'
and    not exists(select 1
                  from   events x
                  where  x.datetime > o.datetime
                  and    x.datetime < c.datetime
                  and    x.occurrence in ('opened', 'closed'));

This may not scale well as you add more rows to the table, but it can use indexes.

Another option is to use window functions:

select (e).*
from   (select e, count(1) filter (where occurrence = 'opened') over (order by datetime rows between unbounded preceding and 1 preceding)
                - count(1) filter (where occurrence = 'closed') over (order by datetime rows between unbounded preceding and current row) open_close
        from   events e) e
where  open_close = 1;

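This one will always need a full table scan (but only one). Another difference is that if there is no closing occurrence = 'closed' row, the window variant will return the trailing rows after the last occurrence = 'opened' row.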

http://rextester.com/FZLV77431

Answer 4 (score: 0)

Query used:

select * from t
inner join (select event_id, 
                   valid_from, 
                   valid_to 
            from (select event_id, 
                         id, 
                         occurrence, 
                         lead(occurrence) over (partition by event_id order by datetime) as next_occurrence, 
                         datetime as valid_from, 
                         lead(datetime) over (partition by event_id order by datetime) as valid_to 
                         from t 
                         where occurrence in ('opened','closed')) A 
           where occurrence = 'opened') t1 on t.event_id = t1.event_id and t1.valid_from <= t.datetime and t.datetime <= t1.valid_to;