My Spark DataFrame contains the following data:
user_id | id | timestamp
---------|----|-------------------
123 | 2 | 2018-10-12 9:25:30
123 | 3 | 2018-10-12 9:27:20
123 | 4 | 2018-10-12 9:45:15
123 | 5 | 2018-10-12 9:47:40
234 | 6 | 2018-10-12 9:26:32
234 | 7 | 2018-10-12 9:28:21
234 | 8 | 2018-10-12 9:46:16
234 | 9 | 2018-10-12 9:48:43
For each user, I need to count the records whose timestamps are less than 15 minutes apart. The result should look like this:
user_id | count | window
---------|-------|----------------------------------------
123 | 2 | 2018-10-12 9:25:30 - 2018-10-12 9:27:20
123 | 2 | 2018-10-12 9:45:15 - 2018-10-12 9:47:40
234 | 2 | 2018-10-12 9:26:32 - 2018-10-12 9:28:21
234 | 2 | 2018-10-12 9:46:16 - 2018-10-12 9:48:43
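
To make the grouping rule concrete, here is a minimal plain-Python sketch (not Spark code). It assumes the rule is gap-based: within one user, a record starts a new group when it is 15 minutes or more after that user's previous record. The names `group_by_gap`, `MAX_GAP` and `FMT` are hypothetical and chosen only for illustration. In Spark the same idea is commonly expressed with `lag` over a window partitioned by `user_id`, followed by a running sum of the "new group" flags, but that is not shown here.

```python
from datetime import datetime, timedelta
from itertools import groupby
from operator import itemgetter

FMT = "%Y-%m-%d %H:%M:%S"          # timestamp layout used in the question
MAX_GAP = timedelta(minutes=15)    # assumed threshold: a gap >= 15 min starts a new group

# Sample rows copied from the question: (user_id, id, timestamp)
rows = [
    (123, 2, "2018-10-12 9:25:30"),
    (123, 3, "2018-10-12 9:27:20"),
    (123, 4, "2018-10-12 9:45:15"),
    (123, 5, "2018-10-12 9:47:40"),
    (234, 6, "2018-10-12 9:26:32"),
    (234, 7, "2018-10-12 9:28:21"),
    (234, 8, "2018-10-12 9:46:16"),
    (234, 9, "2018-10-12 9:48:43"),
]


def group_by_gap(rows, max_gap=MAX_GAP):
    """Return (user_id, count, first_ts, last_ts) for each gap-separated group."""
    # Sort by user, then by time, so consecutive records can be compared.
    parsed = sorted((user_id, datetime.strptime(ts, FMT)) for user_id, _id, ts in rows)
    result = []
    for user_id, items in groupby(parsed, key=itemgetter(0)):
        start = prev = None
        count = 0
        for _, ts in items:
            # Close the current group when the gap to the previous record is too large.
            if prev is not None and ts - prev >= max_gap:
                result.append((user_id, count, start, prev))
                start, count = ts, 0
            if start is None:
                start = ts
            count += 1
            prev = ts
        # Emit the last open group for this user.
        result.append((user_id, count, start, prev))
    return result


for user_id, count, start, end in group_by_gap(rows):
    print(f"{user_id} | {count} | {start.strftime(FMT)} - {end.strftime(FMT)}")
```

On the sample data this yields two groups of two records per user, matching the counts in the expected result. If fixed 15-minute buckets are wanted instead of gap-based groups, the rule above would need to change.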