I have a table user_work_details with two columns: USER_ID and START_TIME.
START_TIME is stored in epoch milliseconds, so the actual time is written out next to each value:
USER_ID START_TIME
-----------------------------
1 1518210035904 Feb 9, 2018 9:00:35 PM
1 1518307236904 Feb 9, 2018 9:00:35 PM
1 1519048475905 Feb 19, 2018 1:54:35 PM
2 1518400835906 Feb 12, 2018 2:00:35 AM
2 1518400837906 Feb 12, 2018 2:00:37 AM
3 1518494435907 Feb 13, 2018 4:00:35 AM
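For reference, the readable times can be derived from the epoch values with a small conversion. The sketch below is in Python and assumes the times are meant as UTC; the helper name `epoch_ms_to_utc` is made up for illustration and is not part of my schema or code.

```python
from datetime import datetime, timedelta, timezone

# Start of the Unix epoch as a timezone-aware UTC datetime.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def epoch_ms_to_utc(ms: int) -> datetime:
    """Convert epoch milliseconds to a timezone-aware UTC datetime.

    Hypothetical helper: integer timedelta arithmetic is used so the
    millisecond part is not affected by float rounding.
    """
    return EPOCH + timedelta(milliseconds=ms)


if __name__ == "__main__":
    # First sample row of user 1.
    print(epoch_ms_to_utc(1518210035904).isoformat())
    # 2018-02-09T21:00:35.904000+00:00
```

Under the UTC assumption, the first row (1518210035904) comes out as Feb 9, 2018 9:00:35 PM.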
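I need to group the records by the difference between their START_TIME values: records that are within 5 minutes of each other belong to the same group. So the output should be: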
USER_ID START_TIME DIFF
--------------------------------------
1 1518210035904 0
1 1518307236904 0
1 1519048475905 1
2 1518400835906 2
2 1518400837906 2
3 1518494435907 3
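DIFF keeps the same value as long as the USER_ID is the same and the gap between two consecutive times is less than 5 minutes. In addition, DIFF has to be incremented on every change.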
I tried the above using LAG(), like this:
SELECT
"USER_ID",
"START_TIME",
CASE WHEN "START_TIME" - LAG("START_TIME", 1, "START_TIME") OVER
(PARTITION BY "USER_ID" ORDER BY "START_TIME") > 60000
THEN 1
ELSE 0
END AS DIFF
FROM "user_work_details"
order by "USER_ID", "START_TIME"
This query returns the following output:
USER_ID START_TIME DIFF
--------------------------------------
1 1518210035904 0
1 1518307236904 1
1 1519048475905 1
2 1518400835906 0
2 1518400837906 1
3 1518494435907 1
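I only need DIFF to increase when there is a change, like a manually incremented counter. How can I do that?
Edit: fixed the output values; the earlier ones were wrong.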
Answer 0 (score: 0)
You can use lag() to define where each group starts, and then use a cumulative sum to assign diff:
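-- diff restarts at 0 for each user because of PARTITION BY user_id
SELECT uwd.*, SUM(flag) OVER (PARTITION BY user_id ORDER BY start_time) as diff
FROM (SELECT uwd.*,
             -- flag is 1 when the gap to the user's previous row exceeds 60000 ms (1 minute);
             -- use 300000 for the 5-minute gap described in the question
             ((START_TIME - LAG(START_TIME, 1, START_TIME) OVER (PARTITION BY USER_ID ORDER BY START_TIME) > 60000)::int) as flag
      FROM user_work_details uwd
     ) uwd;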
I suggest defining the columns without double quotes. Having to quote the names only makes queries harder to write and to read.
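The expected output in the question keeps counting across users (0, 0, 1, 2, 2, 3), while a running sum partitioned by user starts over for every user. Here is a minimal sketch of one way to get a single counter, under these assumptions: the columns are unquoted `user_id` and `start_time`, a user's first row always starts a new group, and the threshold is 5 minutes (300000 ms). It is wrapped in Python's built-in `sqlite3` module only so that it can be run without a server. The SQL uses only `LAG()` and `SUM() OVER`, so it should carry over to PostgreSQL, but I have only sketched it against SQLite (3.25 or later). The names `assign_groups`, `SAMPLE_ROWS` and `FIVE_MINUTES_MS` are made up for this example.

```python
import sqlite3

# Sample rows copied from the question: (user_id, start_time in epoch ms).
SAMPLE_ROWS = [
    (1, 1518210035904),
    (1, 1518307236904),
    (1, 1519048475905),
    (2, 1518400835906),
    (2, 1518400837906),
    (3, 1518494435907),
]

FIVE_MINUTES_MS = 5 * 60 * 1000  # assumed threshold: 5 minutes in milliseconds

QUERY = """
SELECT user_id,
       start_time,
       -- Running total of group starts over all rows, minus 1 so counting begins at 0.
       SUM(flag) OVER (ORDER BY user_id, start_time) - 1 AS diff
FROM (
    SELECT user_id,
           start_time,
           CASE
               -- No previous row for this user, so a new group starts here.
               WHEN LAG(start_time) OVER (PARTITION BY user_id ORDER BY start_time) IS NULL THEN 1
               -- Gap to the previous row of the same user exceeds the threshold.
               WHEN start_time - LAG(start_time) OVER (PARTITION BY user_id ORDER BY start_time) > :gap_ms THEN 1
               ELSE 0
           END AS flag
    FROM user_work_details
) AS flagged
ORDER BY user_id, start_time
"""


def assign_groups(rows, gap_ms=FIVE_MINUTES_MS):
    """Load rows into an in-memory table and return (user_id, start_time, diff) tuples."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE user_work_details (user_id INTEGER, start_time INTEGER)"
        )
        conn.executemany("INSERT INTO user_work_details VALUES (?, ?)", rows)
        return conn.execute(QUERY, {"gap_ms": gap_ms}).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in assign_groups(SAMPLE_ROWS):
        print(row)
```

With the epoch values exactly as posted, this gives diff values 0, 1, 2, 3, 3, 4 rather than 0, 0, 1, 2, 2, 3. The first two rows of user 1 are about 27 hours apart in milliseconds, so they land in separate groups. If those two rows really are meant to be within 5 minutes of each other, the stored START_TIME values would have to reflect that.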