Grouping web sessions

Time: 2017-03-15 18:46:26

Tags: sql sql-server tsql

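The goal is to group users' transaction records into sessions, based on a timeout between consecutive records. (For example: the same user performs 10 actions, each 1-3 minutes apart -> session #1; 2 hours later the user performs another 10 actions, again only a few minutes apart -> session #2.)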

Sample input:

id    user_id    trans_datetime
1     1          2017-03-16 07:12:01
2     2          2017-03-16 07:12:02
3     2          2017-03-16 07:12:12
4     1          2017-03-16 08:57:00
5     1          2017-03-16 08:58:01
6     1          2017-03-16 09:01:50
7     1          2017-03-16 10:14:01
8     1          2017-03-16 10:18:01
9     1          2017-03-16 10:35:11

Expected output:

id  start_id user_id    trans_datetime
1   1        1          2017-03-16 07:12:01
2   2        2          2017-03-16 07:12:02
3   2        2          2017-03-16 07:12:12
4   4        1          2017-03-16 08:57:00
5   4        1          2017-03-16 08:58:01
6   4        1          2017-03-16 09:01:50
7   7        1          2017-03-16 10:14:01
8   7        1          2017-03-16 10:18:01
9   7        1          2017-03-16 10:35:11

My initial idea was to use a recursive CTE:

With rCTE as (
 Select id 
   ,id as start_id
   ,user_id
   ,tran_datetime
 from transactions
 where first_transaction_flg = 1

Union all

Select child.id
   ,parent.id as start_id
   ,child.user_id
   ,child.tran_datetime
from transactions child
 Inner Join rCTE parent
 on child.user_id = parent.user_id
  and child.tran_datetime > parent.datetime
  and datediff(minute, child.tran_datetime, parent.tran_datetime) < 20
)
Select * from rCTE

But it does not seem to work as expected, and I can't quite figure out why.
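
One thing worth checking, offered as a reading of the query above rather than a confirmed diagnosis: `datediff(datepart, startdate, enddate)` returns end minus start. With the child row passed first and the earlier parent row passed second, the result is zero or negative, so the `< 20` test can never fail. A small check, using two timestamps from the sample input:

```sql
-- datediff counts the datepart boundaries crossed going from the 2nd argument to the 3rd.
select
    child_then_parent = datediff(minute, '2017-03-16 08:58:01', '2017-03-16 08:57:00')  -- -1
  , parent_then_child = datediff(minute, '2017-03-16 08:57:00', '2017-03-16 08:58:01'); -- 1
```

With the arguments in the first order, every later row of the same user passes the filter. Separately, `parent.id as start_id` takes the id of the immediate parent row instead of carrying `parent.start_id` forward, so the id of the session's first row is not propagated.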

1 answer:

Answer 0: (score: 1)

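Use a common table expression with a correlated subquery that checks, for each trans_datetime, whether the same user has an earlier activity within the preceding hour; rows without one start a new session. Then use outer apply() to look up the most recent session start for every other row: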

;with ses as (
  -- prevTime: latest earlier activity of the same user within the preceding hour;
  -- null means the row starts a new session
  select 
      t.*
    , prevTime = (
        select max(i.trans_datetime) 
        from t as i
        where i.user_id = t.user_id
          and i.trans_datetime < t.trans_datetime
          and i.trans_datetime >= dateadd(hour,-1,t.trans_datetime)
        )
  from t
)
select 
    id
  , start_id = case 
      when prevTime is null 
        then id 
      else x.start_id
      end
  , user_id
  , trans_datetime
from ses
-- x: the most recent session-start row before the current row, for the same user
outer apply (
  select top 1
    start_id = id
  from ses i
  where i.user_id = ses.user_id 
    and i.trans_datetime < ses.trans_datetime
    and i.prevTime is null
  order by i.trans_datetime desc
    ) x
rextester demo: http://rextester.com/CGJSX81463

Returns:

+----+----------+---------+---------------------+
| id | start_id | user_id |   trans_datetime    |
+----+----------+---------+---------------------+
|  1 |        1 |       1 | 2017-03-16 07:12:01 |
|  2 |        2 |       2 | 2017-03-16 07:12:02 |
|  3 |        2 |       2 | 2017-03-16 07:12:12 |
|  4 |        4 |       1 | 2017-03-16 08:57:00 |
|  5 |        4 |       1 | 2017-03-16 08:58:01 |
|  6 |        4 |       1 | 2017-03-16 09:01:50 |
|  7 |        7 |       1 | 2017-03-16 10:14:01 |
|  8 |        7 |       1 | 2017-03-16 10:18:01 |
|  9 |        7 |       1 | 2017-03-16 10:35:11 |
+----+----------+---------+---------------------+
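
On SQL Server 2012 or later, the same grouping can also be sketched with window functions (the gaps-and-islands pattern) instead of correlated subqueries. The following is a minimal, self-contained sketch and not part of the original answer: the temp table `#t`, its column types, and the names `flagged`, `numbered`, `is_start` and `session_no` are assumptions made for illustration, and the one-hour timeout is taken from the query above.

```sql
-- Assumed schema, inferred from the sample input in the question.
create table #t (
    id             int      not null primary key
  , user_id        int      not null
  , trans_datetime datetime not null
);

insert into #t (id, user_id, trans_datetime) values
    (1, 1, '2017-03-16T07:12:01')
  , (2, 2, '2017-03-16T07:12:02')
  , (3, 2, '2017-03-16T07:12:12')
  , (4, 1, '2017-03-16T08:57:00')
  , (5, 1, '2017-03-16T08:58:01')
  , (6, 1, '2017-03-16T09:01:50')
  , (7, 1, '2017-03-16T10:14:01')
  , (8, 1, '2017-03-16T10:18:01')
  , (9, 1, '2017-03-16T10:35:11');

;with flagged as (
  -- is_start = 1 when the user's previous row is missing or more than one hour older
  select
      t.*
    , is_start = case
        when lag(t.trans_datetime) over (partition by t.user_id order by t.trans_datetime, t.id)
             >= dateadd(hour, -1, t.trans_datetime)
          then 0
        else 1
      end
  from #t as t
), numbered as (
  -- a running total of the start flags numbers the sessions per user
  select
      f.*
    , session_no = sum(f.is_start) over (
        partition by f.user_id
        order by f.trans_datetime, f.id
        rows unbounded preceding)
  from flagged as f
)
select
    id
  , start_id = first_value(id) over (
      partition by user_id, session_no
      order by trans_datetime, id)
  , user_id
  , trans_datetime
from numbered
order by id;

drop table #t;
```

For the sample data this yields the same start_id values as the table above. One difference to be aware of: rows of one user with identical timestamps are ordered by id here and fall into the same session, whereas the strict `<` comparison in the subquery version does not treat such rows as predecessors of each other.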