Using the sample data below, I am trying to split these records into three groups, given a "break" time.
ID | lat | lng | timestamp |
---|---|---|---|
1 | 80.1 | -120.2 | 2021-03-01 01:00:00 |
2 | 80.1 | -120.2 | 2021-03-01 01:01:00 |
3 | 80.1 | -120.2 | 2021-03-01 01:02:00 |
4 | 80.1 | -120.2 | 2021-03-01 01:03:00 |
5 | 80.1 | -120.2 | 2021-03-01 01:04:00 |
6 | 80.1 | -120.2 | 2021-03-01 01:15:00 |
7 | 80.1 | -120.2 | 2021-03-01 01:16:00 |
8 | 80.1 | -120.2 | 2021-03-01 01:17:00 |
9 | 80.1 | -120.2 | 2021-03-01 01:18:00 |
10 | 80.1 | -120.2 | 2021-03-01 02:10:00 |
11 | 80.1 | -120.2 | 2021-03-01 02:11:00 |
12 | 80.1 | -120.2 | 2021-03-01 02:12:00 |
13 | 80.1 | -120.2 | 2021-03-01 02:13:00 |
14 | 80.1 | -120.2 | 2021-03-01 02:14:00 |
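So, if the idle gap is 5 minutes or longer, how can I split these records into 3 groups? The first group would be records 1-5, the second records 6-9, and the third records 10-14, because there is a break of more than 5 minutes between records 5 and 6 and between records 9 and 10.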
Answer 0 (score: 1)
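You can use lag() and a cumulative sum. lag() fetches the previous row's timestamp, a case expression flags each row that starts a new group, and the running sum of those flags numbers the groups: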
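select t.*,
       -- Running sum of the "new group" flags; the first group is numbered 0
       sum( case when prev_timestamp < timestamp - interval 5 minute  -- gap of more than 5 minutes (use <= to also split on exactly 5)
                   or prev_timestamp <> prev_timestamp_ll             -- previous row overall is not the previous row at this lat/lng
                 then 1
                 else 0
            end ) over (order by timestamp) as grp
from (select t.*,
             -- Previous timestamp among rows with the same lat/lng
             lag(timestamp) over (partition by lat, lng order by timestamp) as prev_timestamp_ll,
             -- Previous timestamp across all rows
             lag(timestamp) over (order by timestamp) as prev_timestamp
      from t
     ) t;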
Here is a db<>fiddle.
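To make the grouping logic easier to check by hand, here is a minimal Python sketch of the same idea applied to the sample data from the question. It is an illustration, not a translation of the query: the names `rows`, `assign_groups`, and `gap` are hypothetical, and it detects a location change by comparing coordinates with the previous row directly instead of comparing two lag() values.

```python
from datetime import datetime, timedelta

# Sample rows from the question: (id, lat, lng, timestamp).
STAMPS = [
    "01:00", "01:01", "01:02", "01:03", "01:04",
    "01:15", "01:16", "01:17", "01:18",
    "02:10", "02:11", "02:12", "02:13", "02:14",
]
rows = [
    (i, 80.1, -120.2, datetime.strptime(f"2021-03-01 {t}:00", "%Y-%m-%d %H:%M:%S"))
    for i, t in enumerate(STAMPS, start=1)
]


def assign_groups(rows, gap=timedelta(minutes=5)):
    """Return {id: grp}. grp rises by 1 whenever the time since the previous
    row exceeds `gap` or the location differs from the previous row."""
    groups = {}
    grp = 0      # running sum of "new group" flags, starting at 0
    prev = None  # plays the role of lag(): the previous row in timestamp order
    for row in sorted(rows, key=lambda r: r[3]):
        if prev is not None:
            gap_exceeded = row[3] - prev[3] > gap
            moved = (row[1], row[2]) != (prev[1], prev[2])
            if gap_exceeded or moved:
                grp += 1
        groups[row[0]] = grp
        prev = row
    return groups


groups = assign_groups(rows)
for row_id in sorted(groups):
    print(row_id, groups[row_id])
```

With the sample data, records 1-5 land in group 0, records 6-9 in group 1, and records 10-14 in group 2, which matches the split asked for in the question.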