I have a table like this:
Name | Time |
Sam | 10:58 |
Sam | 10:59 |
Sam | 11:10 |
Tom | 1:16 |
Tom | 1:17 |
Tom | 2:10 |
Tom | 3:44 |
Tom | 3:45 |
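Basically, it is a table that logs a person's activity together with the time of that activity. Everything that appears in this table is an infraction, and entries usually arrive in small clusters. As a rule of thumb, activities that are no more than 3 minutes apart are considered the same attack/infraction. A person can therefore have several entries that belong to the same infraction, or several entries that belong to different infractions.
Ideally, I would like the table to look like this: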
Name | Time | Infraction Number|
Sam | 10:58 | 1 |
Sam | 10:59 | 1 |
Sam | 11:10 | 2 |
Tom | 1:16 | 1 |
Tom | 1:17 | 1 |
Tom | 2:10 | 2 |
Tom | 3:44 | 3 |
Tom | 3:45 | 3 |
Is there any way I can use dense_rank in PostgreSQL to do something like this?
Answer 0: (score: 0)
-- Note: the CASE below only merges rows that are exactly 1 minute apart.
-- That is enough for the sample data, but it does not implement the general 3-minute rule.
SELECT name,
       EXTRACT( HOUR FROM time1 )||':'||EXTRACT( MINUTE FROM time1 ) AS newtime,
       DENSE_RANK() OVER ( PARTITION BY name ORDER BY new_time ) AS infraction_number
FROM
(
    SELECT name,
           time1,
           -- new_time is the start time of the group this row is assigned to
           CASE WHEN ( EXTRACT( EPOCH FROM time1 ) - EXTRACT( EPOCH FROM time1_lag ) )/ 60 IS NULL OR
                     ( EXTRACT( EPOCH FROM time1_lead ) - EXTRACT( EPOCH FROM time1 ) )/ 60 = 1
                THEN time1
                WHEN ( EXTRACT( EPOCH FROM time1 ) - EXTRACT( EPOCH FROM time1_lag ) )/ 60 = 1
                THEN time1_lag
                WHEN ( EXTRACT( EPOCH FROM time1 ) - EXTRACT( EPOCH FROM time1_lag ) )/ 60 <> 1
                THEN time1
                -- the original repeated the first WHEN branch here; it could never be reached and was dropped
           END AS new_time
    FROM
    (
        SELECT name,
               time1,
               LAG( time1, 1) OVER ( PARTITION BY name ORDER BY time1 ) AS time1_lag,
               LEAD( time1, 1) OVER ( PARTITION BY name ORDER BY time1 ) AS time1_lead
        FROM Yourtable
    ) t   -- subquery alias (required by PostgreSQL versions before 16)
) s ;
Answer 1: (score: 0)
This query marks the rows that start a new infraction:
select *, coalesce(time > lag(time) over w + 3*'1m'::interval, true)::int mark
from logs
window w as (partition by name order by time);
name | time | mark
------+----------+------
Sam | 10:58:00 | 1
Sam | 10:59:00 | 0
Sam | 11:10:00 | 1
Tom | 01:16:00 | 1
Tom | 01:17:00 | 0
Tom | 02:10:00 | 1
Tom | 03:44:00 | 1
Tom | 03:45:00 | 0
(8 rows)
Use a cumulative sum of these marks to get what you want:
select name, time, sum(mark) over w as infraction_number
from (
select *, coalesce(time > lag(time) over w + 3*'1m'::interval, true)::int mark
from logs
window w as (partition by name order by time)
) s
window w as (partition by name order by time);
name | time | infraction_number
------+----------+-------------------
Sam | 10:58:00 | 1
Sam | 10:59:00 | 1
Sam | 11:10:00 | 2
Tom | 01:16:00 | 1
Tom | 01:17:00 | 1
Tom | 02:10:00 | 2
Tom | 03:44:00 | 3
Tom | 03:45:00 | 3
(8 rows)
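As a cross-check outside the database, the same mark-and-running-sum idea can be sketched in plain Python. This is not part of the original answer and does not replace the SQL; the function name `number_infractions` and the `ROWS` list are made up for illustration, and the times are assumed to be `HH:MM` strings within a single day.

```python
from datetime import datetime, timedelta
from itertools import groupby

# Sample rows copied from the question (name, time as "H:MM" / "HH:MM").
ROWS = [
    ("Sam", "10:58"), ("Sam", "10:59"), ("Sam", "11:10"),
    ("Tom", "1:16"), ("Tom", "1:17"), ("Tom", "2:10"),
    ("Tom", "3:44"), ("Tom", "3:45"),
]


def number_infractions(rows, max_gap=timedelta(minutes=3)):
    """Hypothetical helper: return (name, time, infraction_number) tuples.

    A row starts a new infraction when it is the first row for its name
    or when it is more than `max_gap` after the previous row of that name.
    """
    # Equivalent of PARTITION BY name ORDER BY time.
    parsed = sorted((name, datetime.strptime(t, "%H:%M")) for name, t in rows)
    result = []
    for name, group in groupby(parsed, key=lambda r: r[0]):
        previous = None
        number = 0  # running sum of the marks within this name
        for _, current in group:
            # mark = 1 plays the role of coalesce(time > lag(time) + 3 min, true)::int
            mark = 1 if previous is None or current - previous > max_gap else 0
            number += mark
            result.append((name, current.strftime("%H:%M"), number))
            previous = current
    return result
```

Calling `number_infractions(ROWS)` numbers the sample rows 1, 1, 2 for Sam and 1, 1, 2, 3, 3 for Tom, matching the query output above. A gap of exactly 3 minutes stays in the same infraction, because the comparison is strictly greater than, as in the SQL.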