SQL query to show user session length

Asked: 2018-03-08 14:17:51

Tags: sql amazon-redshift

I have a table that looks like this:

user_id    page       happened_at
2         'page3'     2017-10-05 11:31    
1         'page2'     2016-02-01 00:02  
2         'page1'     2017-10-05 15:24
3         'page3'     2017-03-31 19:35
4         'page1'     2017-07-09 00:24
2         'page3'     2017-10-05 15:28
1         'page3'     2018-02-01 13:02
2         'page2'     2017-10-05 16:14
2         'page3'     2017-10-05 16:34
etc

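I have a query that identifies user sessions. A session is a user opening page1, page2 and page3 in that order, each within an hour of the previous one (page2 within an hour of page1, and page3 within an hour of page2). Any pages opened in between can be ignored. An example of a session from the table above: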

user_id    page       happened_at
2         'page1'     2017-10-05 15:24
2         'page2'     2017-10-05 16:14
2         'page3'     2017-10-05 16:34 

So far my query looks like this, and it returns the user_id of users who have a session:

select  user_id
from (select user_id,page,happened_at,
      lag(page) over(partition by user_id order by happened_at) as prev_page,
      lead(page) over(partition by user_id order by happened_at) as next_page,
      datediff(minute,lag(happened_at) over(partition by user_id order by happened_at),happened_at) as time_diff_with_prev_action,
      datediff(minute,happened_at,lead(happened_at) over(partition by user_id order by happened_at)) as time_diff_with_next_action
      from tbl
     ) t
where page='page2' and prev_page='page1' and next_page='page3'
and time_diff_with_prev_action <= 60 and time_diff_with_next_action <= 60
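
One thing to note about the query above: lag and lead only look at the immediately adjacent rows, so a page opened in between breaks the match. In the sample rows shown, user 2 has a page3 at 15:28 between page1 at 15:24 and page2 at 16:14, so prev_page for that page2 row is 'page3' rather than 'page1'. A minimal sketch of a matching step that does skip pages in between, assuming happened_at is a timestamp column and using self-joins on tbl (the aliases p1, p2, p3 are illustrative only):

```sql
-- Sketch only: match page1 -> page2 -> page3 per user, ignoring any rows in between.
-- p1, p2, p3 are illustrative aliases for three passes over the same table.
select distinct p1.user_id
from tbl p1
join tbl p2
  on  p2.user_id = p1.user_id
  and p2.page = 'page2'
  and p2.happened_at > p1.happened_at                          -- page2 after page1
  and datediff(minute, p1.happened_at, p2.happened_at) <= 60   -- within an hour
join tbl p3
  on  p3.user_id = p2.user_id
  and p3.page = 'page3'
  and p3.happened_at > p2.happened_at                          -- page3 after page2
  and datediff(minute, p2.happened_at, p3.happened_at) <= 60   -- within an hour
where p1.page = 'page1';
```

For the sample rows, user 2's rows at 15:24, 16:14 and 16:34 satisfy these conditions (50 and 20 minutes apart). On a large table the self-joins may be more expensive than the window-function version, so this is a trade-off rather than a drop-in replacement.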

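What I need is to edit the query so that the output has two more columns: the session start time and the session end time, where the end time is the last action + 1 hour. Please advise how to do this. Temporary tables are not allowed, so it has to be a single query. The expected output would be: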

user_id    session_start     session_end
2          2017-10-05 15:24  2017-10-05 17:34

Thanks for your time!
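
One possible way to get the two columns while keeping the lag/lead structure of the query above is to carry the neighbouring timestamps out of the subquery and shift the last one with dateadd. This is a sketch, not a tested solution: it assumes happened_at is a timestamp, the names prev_happened_at and next_happened_at are introduced here for illustration, and it inherits the adjacent-row behaviour of lag and lead (pages opened in between are not skipped).

```sql
-- Sketch only: the original query extended with session_start and session_end.
-- prev_happened_at / next_happened_at are illustrative names, not from the original.
select user_id,
       prev_happened_at                   as session_start,  -- time of the page1 row
       dateadd(hour, 1, next_happened_at) as session_end     -- time of the page3 row + 1 hour
from (select user_id, page, happened_at,
             lag(page)         over (partition by user_id order by happened_at) as prev_page,
             lead(page)        over (partition by user_id order by happened_at) as next_page,
             lag(happened_at)  over (partition by user_id order by happened_at) as prev_happened_at,
             lead(happened_at) over (partition by user_id order by happened_at) as next_happened_at
      from tbl
     ) t
where page = 'page2' and prev_page = 'page1' and next_page = 'page3'
  and datediff(minute, prev_happened_at, happened_at) <= 60
  and datediff(minute, happened_at, next_happened_at) <= 60;
```

The two output expressions (the page1 timestamp as session_start, and dateadd(hour, 1, ...) on the page3 timestamp as session_end) do not depend on how the three rows are matched, so they should carry over to other matching strategies as well.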

0 answers