PostgreSQL:根据时间间隔对列进行分组

时间:2020-06-22 12:38:03

标签: sql postgresql aggregate

我有下表:

SELECT * FROM trajectories
LIMIT 10;
user_id |    session_id     |       timestamp        |    lat    |    lon     | alt  
---------+-------------------+------------------------+-----------+------------+------
      11 | 10020071017220238 | 2007-10-18 02:51:38+01 | 37.780927 | 113.677553 | 2160
      11 | 10020071017220238 | 2007-10-18 02:51:39+01 |  37.78093 | 113.677627 | 2160
      11 | 10020071017220238 | 2007-10-18 02:51:40+01 | 37.780932 | 113.677698 | 2160
      11 | 10020071017220238 | 2007-10-18 02:51:41+01 | 37.780938 | 113.677772 | 2159
      11 | 10020071017220238 | 2007-10-18 02:51:42+01 | 37.780945 | 113.677845 | 2159
      11 | 10020071017220238 | 2007-10-18 02:51:43+01 | 37.780952 | 113.677918 | 2159
      11 | 10020071017220238 | 2007-10-18 02:51:44+01 | 37.780962 |  113.67799 | 2159
      11 | 10020071017220238 | 2007-10-18 02:51:45+01 | 37.780973 |  113.67806 | 2159
      11 | 10020071017220238 | 2007-10-18 02:51:46+01 |  37.78098 | 113.678128 | 2159
      11 | 10020071017220238 | 2007-10-18 02:51:47+01 | 37.780992 | 113.678192 | 2157
(10 rows)

SELECT * FROM labels
WHERE travel mode = 'subway'
LIMIT 10
user_id | session_id |    start_timestamp     |     end_timestamp      | travelmode 
---------+------------+------------------------+------------------------+------------
      11 |          0 | 2008-06-18 04:46:10+01 | 2008-06-18 04:54:59+01 | subway
      11 |          0 | 2008-08-01 02:51:47+01 | 2008-08-01 03:37:43+01 | subway
      11 |          0 | 2008-08-01 03:59:36+01 | 2008-08-01 04:30:20+01 | subway
      11 |          0 | 2008-09-16 00:58:43+01 | 2008-09-16 01:07:14+01 | subway
      11 |          0 | 2008-09-16 11:49:05+01 | 2008-09-16 12:03:05+01 | subway
      11 |          0 | 2008-09-18 00:41:41+01 | 2008-09-18 00:50:43+01 | subway
      11 |          0 | 2008-09-18 10:43:23+01 | 2008-09-18 10:53:03+01 | subway
      11 |          0 | 2008-09-19 10:46:56+01 | 2008-09-19 10:56:10+01 | subway
      11 |          0 | 2008-09-21 23:58:45+01 | 2008-09-22 00:07:41+01 | subway
      11 |          0 | 2008-09-22 11:14:52+01 | 2008-09-22 11:24:30+01 | subway
(10 rows)

几乎有5M个标记为行驶模式的点:

SELECT COUNT(*) 
FROM trajectories t
JOIN labels l
 ON t.user_id = l.user_id
WHERE t.timestamp >= l.start_timestamp AND t.timestamp <= l.end_timestamp
 count  
---------
 4931303
(1 row)

但是我想根据间隔(在subway表中)来了解trajectories模式的捕获率,即在1-5 seconds, 5-10 seconds, 10- 20seconds到{{1}之间的点有多少点}

1 个答案:

答案 0 :(得分:2)

您可以使用条件聚合:

SELECT COUNT(*),
       COUNT(*) FILTER (WHERE end_timestamp >= start_timestamp AND end_timestamp < start_timestamp + interval '5 second'),
       COUNT(*) FILTER (WHERE end_timestamp >= start_timestamp + interval '5 second' AND end_timestamp < start_timestamp + interval '10 second'),
       . . .
FROM trajectories t JOIN
     labels l
     ON t.user_id = l.user_id
WHERE t.timestamp >= l.start_timestamp AND t.timestamp <= l.end_timestamp