dense_rank manipulation in PostgreSQL

Asked: 2017-03-17 20:22:33

Tags: sql postgresql

I have a table like this:

 Name | Time
------+-------
 Sam  | 10:58
 Sam  | 10:59
 Sam  | 11:10
 Tom  | 1:16
 Tom  | 1:17
 Tom  | 2:10
 Tom  | 3:44
 Tom  | 3:45

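Basically, it is a table that logs a person's activity and shows the time of each activity. Everything that appears in this table is an infraction, and the entries usually occur in small clusters. As a rule of thumb, activities that are no more than 3 minutes apart are considered the same offense/infraction. So a person can have multiple entries in the table that all belong to the same infraction, or multiple entries that belong to different infractions.

Ideally, I would like the table to look like this: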

 Name | Time  | Infraction Number
------+-------+-------------------
 Sam  | 10:58 |                 1
 Sam  | 10:59 |                 1
 Sam  | 11:10 |                 2
 Tom  | 1:16  |                 1
 Tom  | 1:17  |                 1
 Tom  | 2:10  |                 2
 Tom  | 3:44  |                 3
 Tom  | 3:45  |                 3

Is there any way I can use dense_rank in PostgreSQL to do something like this?

2 answers:

Answer 0: (score: 0)
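For anyone who wants to reproduce the problem, here is a minimal setup sketch. The table name `logs` and the lowercase column names are borrowed from the second answer below; the column types (`text` and `time`) are assumptions, since the question does not state them. The first answer uses `Yourtable` and `time1` instead, so rename accordingly when trying that query.

```sql
-- Hypothetical setup: table name from the second answer, column types assumed.
create table logs (
    name text not null,
    time time not null
);

-- The eight sample rows from the question.
insert into logs (name, time) values
    ('Sam', '10:58'),
    ('Sam', '10:59'),
    ('Sam', '11:10'),
    ('Tom', '1:16'),
    ('Tom', '1:17'),
    ('Tom', '2:10'),
    ('Tom', '3:44'),
    ('Tom', '3:45');
```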

SELECT name,
       EXTRACT( HOUR FROM time1 )||':'||EXTRACT( MINUTE FROM time1 ) AS newtime,
       DENSE_RANK() OVER ( PARTITION BY name ORDER BY new_time ) AS infraction_number
  FROM
    (
        -- new_time is the time each row is ranked by: a row that follows its
        -- predecessor by exactly 1 minute takes the predecessor's time.
        -- Note: only 1-minute gaps are merged, and only in pairs, so this matches
        -- the sample data but not the general "within 3 minutes" rule.
        SELECT name,
               time1,
               CASE WHEN ( EXTRACT( EPOCH FROM time1 ) - EXTRACT( EPOCH FROM time1_lag ) )/ 60 IS NULL OR
                         ( EXTRACT( EPOCH FROM time1_lead ) - EXTRACT( EPOCH FROM time1 ) )/ 60 = 1
                    THEN time1
                    WHEN ( EXTRACT( EPOCH FROM time1 ) - EXTRACT( EPOCH FROM time1_lag ) )/ 60 = 1
                    THEN time1_lag
                    ELSE time1
                END AS new_time
          FROM
            (
                -- Previous and next activity time for the same person.
                SELECT name,
                       time1,
                       LAG( time1, 1) OVER ( PARTITION BY name ORDER BY time1 ) AS time1_lag,
                       LEAD( time1, 1) OVER ( PARTITION BY name ORDER BY time1 ) AS time1_lead
                  FROM Yourtable
            ) AS lagged
    ) AS anchored;

Answer 1: (score: 0)

This query marks the rows that start a new infraction:

select *, coalesce(time > lag(time) over w + 3*'1m'::interval, true)::int mark
from logs
window w as (partition by name order by time);

 name |   time   | mark 
------+----------+------
 Sam  | 10:58:00 |    1
 Sam  | 10:59:00 |    0
 Sam  | 11:10:00 |    1
 Tom  | 01:16:00 |    1
 Tom  | 01:17:00 |    0
 Tom  | 02:10:00 |    1
 Tom  | 03:44:00 |    1
 Tom  | 03:45:00 |    0
(8 rows)
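To make the boundary of the comparison concrete, here is a small sketch using literal `time` values rather than the table. It assumes the column is of type `time`, which the question does not confirm. A gap of exactly 3 minutes does not start a new infraction, which matches the "no more than 3 minutes" rule. The last two columns illustrate a caveat: adding an interval to a `time` value wraps around at midnight, so comparing the difference of the two times against the interval is a possible alternative for rows close to midnight.

```sql
select
    -- 11:01 is exactly 3 minutes after 10:58: same infraction
    time '11:01' > time '10:58' + interval '3 minutes' as new_after_3_min,        -- false
    -- 11:02 is 4 minutes after 10:58: new infraction
    time '11:02' > time '10:58' + interval '3 minutes' as new_after_4_min,        -- true
    -- 23:58 + 3 minutes wraps to 00:01, so a 1-minute gap is flagged as new
    time '23:59' > time '23:58' + interval '3 minutes' as new_near_midnight,      -- true
    -- comparing the difference instead avoids the wrap-around
    time '23:59' - time '23:58' > interval '3 minutes' as new_by_difference;      -- false
```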

Use a cumulative sum of these marks to get what you want:

select name, time, sum(mark) over w as infraction_number
from (
    select *, coalesce(time > lag(time) over w + 3*'1m'::interval, true)::int mark
    from logs
    window w as (partition by name order by time)
    ) s
window w as (partition by name order by time);

 name |   time   | infraction_number 
------+----------+-------------------
 Sam  | 10:58:00 |                 1
 Sam  | 10:59:00 |                 1
 Sam  | 11:10:00 |                 2
 Tom  | 01:16:00 |                 1
 Tom  | 01:17:00 |                 1
 Tom  | 02:10:00 |                 2
 Tom  | 03:44:00 |                 3
 Tom  | 03:45:00 |                 3
(8 rows)
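Since the question asks specifically about dense_rank, here is a sketch of a variant that ends with `dense_rank()` instead of a running sum. It is not part of the original answer. It reuses the same start-of-infraction test, carries the most recent start time forward to every row, and then ranks rows by that start time. The CTE names `marked` and `anchored` and the column names `is_start` and `infraction_start` are made up for illustration. For the sample rows it should give the same numbering as the running-sum query above.

```sql
with marked as (
    -- true for rows that start a new infraction (same test as in the answer)
    select name, time,
           coalesce(time > lag(time) over w + interval '3 minutes', true) as is_start
    from logs
    window w as (partition by name order by time)
),
anchored as (
    -- running max of the start times = start time of the row's own infraction
    select name, time,
           max(case when is_start then time end)
               over (partition by name order by time) as infraction_start
    from marked
)
select name, time,
       dense_rank() over (partition by name order by infraction_start) as infraction_number
from anchored
order by name, time;
```

The running sum is the shorter of the two; the dense_rank form mainly helps if the start time of each infraction is also wanted as a column.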
