SQL: extracting events that fall within a time range

Time: 2018-07-26 11:50:12

Tags: sql oracle hive

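I am able to extract the login and logout times from an event log, based on this example: SQL login/logout table

However, I am having trouble determining whether a user accessed Drive C and Drive D while logged in, that is, whether the access falls between the login time and the logout time. The event log and the result I am after are shown below.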

+-----+------------------+-------------------+----------+
| S/N |      Event       |     Timestamp     | Username |
+-----+------------------+-------------------+----------+
|   1 | Login            | 26 Jul 2018 19:35 | a        |
|   2 | Login            | 26 Jul 2018 20:00 | b        |
|   3 | Access Drive C   | 26 Jul 2018 20:30 | b        |
|   4 | Access Drive D   | 26 Jul 2018 20:30 | b        |
|   5 | Logout           | 26 Jul 2018 21:00 | b        |
|   6 | Login            | 26 Jul 2018 22:00 | c        |
|   7 | Login            | 26 Jul 2018 22:30 | c        |
|   8 | Access Service C | 26 Jul 2018 22:30 | c        |
|   9 | Logout           | 26 Jul 2018 23:00 | c        |
+-----+------------------+-------------------+----------+

+-----+-------------------+-------------------+----------+----------+---------+---------+
| S/N |       Login       |      Logout       | Username | Duration | Drive C | Drive D |
+-----+-------------------+-------------------+----------+----------+---------+---------+
|   1 | 26 Jul 2018 19:35 | NULL              | a        | NULL     | NULL    | NULL    |
|   2 | 26 Jul 2018 20:00 | 26 Jul 2018 21:00 | b        | 10min    | Y       | Y       |
|   3 | 26 Jul 2018 20:00 | 26 Jul 2018 20:30 | c        | 30min    | Y       | N       |
+-----+-------------------+-------------------+----------+----------+---------+---------+

1 answer:

Answer 0 (score: 1)

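Presumably a user can log in and out many times, and you are only looking for accesses within a given login period?

If you are on Oracle Database 12c or later, you can do this with pattern matching (the match_recognize clause). This lets you search your rows using regular-expression-style patterns: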

login+ ( drive_c | drive_d )* logout{0,1}

This means:

  • Find one or more login events
  • Followed by zero or more accesses to drive C or D
  • With an optional logout at the end
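To get a feel for how this row pattern behaves, it can help to mimic it with an ordinary regular expression over a string. The sketch below is only an analogy, not how Oracle evaluates match_recognize: it encodes one user's events, already sorted by time, as single letters (L, C, D, O and X are codes made up for this illustration) and applies the equivalent regex.

```python
import re

# Hypothetical one-letter codes for the pattern variables (illustration only).
CODES = {"login": "L", "drive_c": "C", "drive_d": "D", "logout": "O"}

def classify(event):
    """Mimic the DEFINE clause: map an event name to a pattern variable code."""
    if event == "Login":
        return CODES["login"]
    if event == "Logout":
        return CODES["logout"]
    # LIKE 'Access%C' / 'Access%D': starts with 'Access' and ends with C or D.
    if event.startswith("Access") and event.endswith("C"):
        return CODES["drive_c"]
    if event.startswith("Access") and event.endswith("D"):
        return CODES["drive_d"]
    return "X"  # matches none of the variables

# login+ ( drive_c | drive_d )* logout{0,1}
SESSION = re.compile(r"L+[CD]*O?")

def sessions(events):
    """Return the matched runs for one user's events, given in time order."""
    encoded = "".join(classify(e) for e in events)
    return [m.group(0) for m in SESSION.finditer(encoded)]

# Events per user from the sample table, in timestamp order.
user_a = ["Login"]
user_b = ["Login", "Access Drive C", "Access Drive D", "Logout"]
user_c = ["Login", "Login", "Access Service C", "Logout"]

print(sessions(user_a))  # ['L']
print(sessions(user_b))  # ['LCDO']
print(sessions(user_c))  # ['LLCO']
```

Note that 'Access Service C' is classified as a drive C access here, because it starts with 'Access' and ends with 'C', just as it would satisfy LIKE 'Access%C'.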

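I don't know where you got a duration of 10 minutes for user b and 30 minutes for user c, so I have shown the total time from the first login to the logout (if there is one). That is what this expression computes: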

last ( logout.ts ) - first ( login.ts )

If you need a different interval, change the pattern variable named in front of ts.
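As a sanity check on the arithmetic: subtracting two Oracle DATE values yields a number of days, so multiplying by 1440 (the minutes in a day) converts the result to minutes. The Python sketch below reproduces that calculation for user c's rows from the sample data and shows how the result changes when the interval starts at the last login instead of the first. The helper name minutes_between is made up for this illustration.

```python
from datetime import datetime

def minutes_between(start, end):
    """Whole minutes from start to end, computed as days * 1440."""
    days = (end - start).total_seconds() / 86400  # a DATE difference is in days
    return round(days * 1440)

FMT = "%Y-%m-%d %H:%M:%S"

# User c's rows from the sample data.
logins = [
    datetime.strptime("2018-07-26 22:00:00", FMT),
    datetime.strptime("2018-07-26 22:30:00", FMT),
]
logout = datetime.strptime("2018-07-26 23:00:00", FMT)

print(minutes_between(logins[0], logout))   # from the first login: 60
print(minutes_between(logins[-1], logout))  # from the last login: 30
```

In SQL terms, the second figure corresponds to writing last ( login.ts ) where the expression above has first ( login.ts ). Whether that is the interval the question intended for user c is only a guess.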

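To check whether they accessed C or D, you need to see whether those pattern variables matched any rows. Returning the count of each variable is one way to do this.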

-- Sample data from the question
create table t (
  sn int, event varchar2(16), ts date, username varchar2(1)
);
-- Lets the string literals below convert implicitly to DATE
alter session set nls_date_format = 'yyyy-mm-dd hh24:mi:ss';
insert into t values (1, 'Login', '2018-07-26 19:35:00', 'a');
insert into t values (2, 'Login', '2018-07-26 20:00:00', 'b');
insert into t values (3, 'Access Drive C', '2018-07-26 20:30:00', 'b');
insert into t values (4, 'Access Drive D', '2018-07-26 20:30:00', 'b');
insert into t values (5, 'Logout', '2018-07-26 21:00:00', 'b');
insert into t values (6, 'Login', '2018-07-26 22:00:00', 'c');
insert into t values (7, 'Login', '2018-07-26 22:30:00', 'c');
insert into t values (8, 'Access Service C', '2018-07-26 22:30:00', 'c');
insert into t values (9, 'Logout', '2018-07-26 23:00:00', 'c');

select * from t
match_recognize (
  partition by username
  order by ts, sn   -- sn breaks ties between rows with the same ts
  measures
    first ( login.ts ) as login,
    last ( logout.ts ) as logout,
    -- date subtraction returns days; * 1440 converts to minutes
    round ( ( last ( logout.ts ) - first ( login.ts ) ) * 1440 ) as duration_minutes,
    -- a non-zero count means the variable matched at least one row
    case when count ( drive_c.ts ) > 0 then 'Y' else 'N' end as access_c,
    case when count ( drive_d.ts ) > 0 then 'Y' else 'N' end as access_d
  pattern ( login+ ( drive_c | drive_d )* logout{0,1} )
  define
    login as event = 'Login',
    drive_c as event like 'Access%C',   -- also matches 'Access Service C'
    drive_d as event like 'Access%D',
    logout as event = 'Logout'
);

U LOGIN               LOGOUT              DURATION_MINUTES A A
- ------------------- ------------------- ---------------- - -
a 2018-07-26 19:35:00                                      N N
b 2018-07-26 20:00:00 2018-07-26 21:00:00               60 Y Y
c 2018-07-26 22:00:00 2018-07-26 23:00:00               60 Y N

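If you need this to work on older versions, you can:

  • Use first_value to find the time of the next logout after each login. Make sure you ignore nulls!
  • Do the same for the accesses to C and D
  • Group by username and the next logout value to find the first login of each period
  • Check whether the timestamps you found for C and D with first_value fall between the login and logout times, and report accordingly

Which looks like: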

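with next_logouts as (
  -- for each row, look forward within the same user for the next logout
  -- and the next access to C and to D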
  select t.*,
         first_value (
           case when event = 'Logout' then ts end
         ) ignore nulls over (
           partition by username
           order  by ts rows between current row and unbounded following 
         ) next_logout,
         first_value (
           case when event like 'Access%C' then ts end
         ) ignore nulls over (
           partition by username
           order  by ts rows between current row and unbounded following 
         ) access_c,
         first_value (
           case when event like 'Access%D' then ts end
         ) ignore nulls over (
           partition by username
           order  by ts rows between current row and unbounded following 
         ) access_d
  from   t
), periods as (
  -- rows sharing a next_logout belong to the same login period
  select username, min ( ts ) login, next_logout logout,
         case
           when min ( access_c ) > min ( ts ) and min ( access_c ) < next_logout then 'Y'
           else 'N'
         end access_c,
         case
           when min ( access_d ) > min ( ts ) and min ( access_d ) < next_logout then 'Y'
           else 'N'
         end access_d
  from   next_logouts n
  group  by username, next_logout
)
  select p.*
  from   periods p
  order  by username, login;

USERNAME   LOGIN                 LOGOUT                ACCESS_C   ACCESS_D   
a          2018-07-26 19:35:00   <null>                N          N          
b          2018-07-26 20:00:00   2018-07-26 21:00:00   Y          Y          
c          2018-07-26 22:00:00   2018-07-26 23:00:00   Y          N
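
The forward-looking first_value ( ... ) ignore nulls calls in the query above can be hard to picture. The Python sketch below imitates that window for a single user: for each row it returns the first non-null value at or after that row. The function name next_non_null and the shortened HH:MM timestamps are assumptions made for this illustration.

```python
def next_non_null(values):
    """For each position, return the first non-None value at or after it.

    Imitates first_value(x) ignore nulls over a window running from the
    current row to the end of the partition.
    """
    result = [None] * len(values)
    upcoming = None
    # Walk backwards so each row can reuse what was found after it.
    for i in range(len(values) - 1, -1, -1):
        if values[i] is not None:
            upcoming = values[i]
        result[i] = upcoming
    return result

# User c's rows in timestamp order: (event, time of day).
rows = [
    ("Login", "22:00"),
    ("Login", "22:30"),
    ("Access Service C", "22:30"),
    ("Logout", "23:00"),
]

# Equivalent of: case when event = 'Logout' then ts end
logout_ts = [ts if event == "Logout" else None for event, ts in rows]

print(next_non_null(logout_ts))  # ['23:00', '23:00', '23:00', '23:00']
```

Every row of the period ends up carrying the same next-logout value, which is why grouping by username and next_logout collapses each login period into one row. Rows after the last logout get None, which corresponds to the null logout shown for user a.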