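I have a table containing session_id, user_id, start_time and value.
Technically, a user should get a new session_id every 30 minutes, so there should never be 2 entries with the same user_id whose start times are within 30 minutes of each other.
How can I run a query to find these error cases?
I did something like this to look at some of the time differences between entries for a given user: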
select t1.start_time - t2.start_time
from user_sessions as t1 inner join
user_sessions as t2
on t1.user_id = 1 and t2.user_id = 1
I know I am looking for cases where the following holds:
((t1.start_time-t2.start_time) < 60*30*1000000 and (t1.start_time-t2.start_time) > 0) and t1.user_id = t2.user_id
I'm just not sure how to combine these two parts into a single query.
Answer 0 (score: 0)
Does this do what you want?
select t1.start_time - t2.start_time
from user_sessions t1 inner join
user_sessions t2
on t1.user_id = t2.user_id
where (t1.start_time - t2.start_time) < 60*30*1000000 and
(t1.start_time - t2.start_time) > 0;
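To make the self-join concrete, here is a minimal sketch that runs the same join against an in-memory SQLite database from Python. Several things here are assumptions rather than facts from the question: SQLite stands in for whatever database is actually used, start_time is taken to be an integer number of microseconds (inferred from the 60*30*1000000 constant), and the sample rows and the find_overlapping_sessions helper are hypothetical. The select list is also widened to return the two session_id values so each offending pair can be identified.

```python
import sqlite3

THIRTY_MIN_US = 30 * 60 * 1_000_000  # 30 minutes, assuming microsecond integers
MINUTE_US = 60 * 1_000_000


def find_overlapping_sessions(rows):
    """Return (user_id, earlier_session, later_session, gap_us) for every pair
    of sessions of the same user that start less than 30 minutes apart."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE user_sessions ("
        "session_id INTEGER, user_id INTEGER, start_time INTEGER, value TEXT)"
    )
    conn.executemany("INSERT INTO user_sessions VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT t1.user_id,
               t2.session_id,
               t1.session_id,
               t1.start_time - t2.start_time AS gap_us
        FROM user_sessions AS t1
        INNER JOIN user_sessions AS t2
                ON t1.user_id = t2.user_id
        WHERE t1.start_time - t2.start_time > 0   -- t1 is strictly later than t2
          AND t1.start_time - t2.start_time < ?   -- but by less than 30 minutes
        ORDER BY t1.user_id, t2.start_time, t1.start_time
        """,
        (THIRTY_MIN_US,),
    ).fetchall()
    conn.close()
    return result


# Hypothetical sample data: (session_id, user_id, start_time, value)
sample_rows = [
    (1, 1, 0, "a"),
    (2, 1, 10 * MINUTE_US, "b"),  # 10 minutes after session 1: a violation
    (3, 1, 45 * MINUTE_US, "c"),  # 35 minutes after session 2: fine
    (4, 2, 5 * MINUTE_US, "d"),   # different user: never paired with user 1
]
violations = find_overlapping_sessions(sample_rows)
print(violations)
```

The "> 0" condition keeps a row from matching itself and reports each pair only once, with t1 as the later session. Note that two sessions with exactly the same start_time would not be reported by this condition.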
Answer 1 (score: 0)
There is a simple way to calculate the time difference between rows, using LAG() OVER():
SELECT
user_id, previous_start, start_time, minutes_diff
FROM (
SELECT
user_id
, start_time -- must be selected here because the outer query references it
, LAG(start_time) OVER(PARTITION BY user_id ORDER BY start_time) previous_start
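-- EXTRACT(EPOCH ...) gives the whole interval in seconds (assuming start_time is a timestamp);
-- EXTRACT(MINUTES ...) would return only the minutes field, e.g. 5 for an interval of 1 hour 5 minutes
, EXTRACT(EPOCH FROM
start_time - LAG(start_time) OVER(PARTITION BY user_id ORDER BY start_time)
) / 60 minutes_diff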
FROM user_sessions
) d
WHERE minutes_diff < 30
;
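The same idea can be tried end to end with a small sketch in Python against an in-memory SQLite database. The assumptions are mine, not the answer's: SQLite 3.25 or later (needed for window functions) stands in for the real database, start_time is stored as integer microseconds as the question's 60*30*1000000 constant suggests (so the difference is divided by 60,000,000 instead of using EXTRACT), and the sample rows are hypothetical.

```python
import sqlite3

THIRTY_MIN_US = 30 * 60 * 1_000_000  # 30 minutes, assuming microsecond integers
MINUTE_US = 60 * 1_000_000

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user_sessions ("
    "session_id INTEGER, user_id INTEGER, start_time INTEGER, value TEXT)"
)
# Hypothetical sample data: (session_id, user_id, start_time, value)
conn.executemany(
    "INSERT INTO user_sessions VALUES (?, ?, ?, ?)",
    [
        (1, 1, 0, "a"),
        (2, 1, 10 * MINUTE_US, "b"),  # 10 minutes after session 1
        (3, 1, 20 * MINUTE_US, "c"),  # 10 minutes after session 2
        (4, 1, 90 * MINUTE_US, "d"),  # 70 minutes after session 3: fine
        (5, 2, 0, "e"),               # first session of another user
    ],
)

too_close = conn.execute(
    """
    SELECT user_id, session_id, previous_start, start_time,
           (start_time - previous_start) / 60000000.0 AS minutes_diff
    FROM (
        SELECT user_id, session_id, start_time,
               LAG(start_time) OVER (
                   PARTITION BY user_id ORDER BY start_time
               ) AS previous_start
        FROM user_sessions
    ) AS d
    -- previous_start is NULL for each user's first session, so the
    -- comparison is NULL there and that row is filtered out.
    WHERE start_time - previous_start < ?
    ORDER BY user_id, start_time
    """,
    (THIRTY_MIN_US,),
).fetchall()
conn.close()
print(too_close)
```

Because LAG only looks at the immediately preceding session of the same user, each offending session is reported once, next to its predecessor. That is enough to detect every problem: if any two sessions of a user start within 30 minutes of each other, the consecutive sessions between them do too. A self-join, by contrast, would also list the pair of sessions 1 and 3 in this sample.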