Oracle SQL: Exclude Rows Where TimeStamps are within minutes of each other

Time: 2017-06-15 09:54:12

Tags: sql oracle

I have a table of transactions. I need to exclude any transaction that is within 15 minutes of the previous transaction for the same USERID.

EXAMPLE

USERID          TRANS_TIME  
----------------------------------------  
00000001    24-FEB-17 15.13.51.713000000
00000001    16-MAR-17 10.10.20.781000000
00000001    16-MAR-17 10.10.32.659000000
00000001    16-MAR-17 10.13.04.070000000
00000001    16-MAR-17 10.13.49.339000000
00000001    16-MAR-17 10.22.33.467000000
00000001    16-MAR-17 10.23.09.755000000
00000001    16-MAR-17 10.25.51.994000000
00000001    16-MAR-17 10.26.08.130000000
00000001    29-MAR-17 10.23.01.665000000

So I would end up with 4 rows.

USERID          TRANS_TIME  
----------------------------------------  
00000001    24-FEB-17 15.13.51.713000000
00000001    16-MAR-17 10.10.20.781000000
00000001    16-MAR-17 10.25.51.994000000
00000001    29-MAR-17 10.23.01.665000000

Any ideas or tips on how to code for this? Ideally without creating a function or a procedure.

Cheers.

3 answers:

Answer 0 (score: 1)

Just use lag():

select t.*
from (select t.*,
             lag(trans_time) over (partition by userid order by trans_time) as prev_tt
      from t
     ) t
where prev_tt is null or
      trans_time > prev_tt + (15 / (24 * 60));

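Note: You can write the where clause using interval notation instead. That is actually the better approach: adding a number to a TIMESTAMP implicitly converts it to a DATE and drops the fractional seconds, whereas adding an interval keeps the TIMESTAMP precision.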

where prev_tt is null or
      trans_time > prev_tt + interval '15' minute;
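
As a sanity check on the previous-row comparison, here is a small Python sketch (not Oracle code, and not part of the original answer) that mimics the lag() filter on the sample rows from the question. The name keep_by_previous_row is made up for illustration, and the timestamps are truncated to microseconds. Under these assumptions only 3 rows survive: 16-MAR-17 10.25.51.994 is dropped because it is within 15 minutes of the row immediately before it, so this filter alone does not seem to reproduce the 4-row output requested in the question.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=15)

# TRANS_TIME values for USERID 00000001, copied from the question.
times = [
    datetime(2017, 2, 24, 15, 13, 51, 713000),
    datetime(2017, 3, 16, 10, 10, 20, 781000),
    datetime(2017, 3, 16, 10, 10, 32, 659000),
    datetime(2017, 3, 16, 10, 13, 4, 70000),
    datetime(2017, 3, 16, 10, 13, 49, 339000),
    datetime(2017, 3, 16, 10, 22, 33, 467000),
    datetime(2017, 3, 16, 10, 23, 9, 755000),
    datetime(2017, 3, 16, 10, 25, 51, 994000),
    datetime(2017, 3, 16, 10, 26, 8, 130000),
    datetime(2017, 3, 29, 10, 23, 1, 665000),
]


def keep_by_previous_row(sorted_times, window=WINDOW):
    """Hypothetical helper: keep a row if it has no predecessor or is
    more than `window` after the immediately preceding row (like lag())."""
    kept = []
    prev = None
    for t in sorted_times:
        if prev is None or t > prev + window:
            kept.append(t)
        prev = t  # always advance, whether or not the row was kept
    return kept


lag_result = keep_by_previous_row(sorted(times))
for t in lag_result:
    print(t)
```

The difference from the requested output comes from `prev` advancing on every row, kept or not, which is exactly what lag() does.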

Answer 1 (score: 1)

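I interpret your required logic as follows:

Separately for each userid, include the row with the earliest transaction time. Then, for each subsequent row, check whether it is within 15 minutes (<=) of the most recently included row; if it is, exclude the "current" row you are examining. If the new row is not within 15 minutes of the most recently included row, include it.

In other words, there are 15-minute sessions. A row opens a new session if it does not already fall within a session opened by an earlier row. In this arrangement, as your desired output shows, it is not enough to compare each row only with the row immediately before it.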
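
The session rule above can be sketched outside the database as a single pass per user. This is an illustrative Python sketch, not Oracle code; the function name first_of_each_session and the (userid, trans_time) tuple layout are assumptions made for the example, and the timestamps are truncated to microseconds.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# (userid, trans_time) rows copied from the question.
rows = [
    ("00000001", datetime(2017, 2, 24, 15, 13, 51, 713000)),
    ("00000001", datetime(2017, 3, 16, 10, 10, 20, 781000)),
    ("00000001", datetime(2017, 3, 16, 10, 10, 32, 659000)),
    ("00000001", datetime(2017, 3, 16, 10, 13, 4, 70000)),
    ("00000001", datetime(2017, 3, 16, 10, 13, 49, 339000)),
    ("00000001", datetime(2017, 3, 16, 10, 22, 33, 467000)),
    ("00000001", datetime(2017, 3, 16, 10, 23, 9, 755000)),
    ("00000001", datetime(2017, 3, 16, 10, 25, 51, 994000)),
    ("00000001", datetime(2017, 3, 16, 10, 26, 8, 130000)),
    ("00000001", datetime(2017, 3, 29, 10, 23, 1, 665000)),
]


def first_of_each_session(rows, window=timedelta(minutes=15)):
    """Hypothetical helper: per userid, keep a row only if it is more than
    `window` after the most recently KEPT row (the current session start)."""
    by_user = defaultdict(list)
    for userid, trans_time in rows:
        by_user[userid].append(trans_time)

    kept = []
    for userid in sorted(by_user):
        session_start = None
        for t in sorted(by_user[userid]):
            if session_start is None or t > session_start + window:
                kept.append((userid, t))
                session_start = t  # only a kept row opens a new session
    return kept


kept = first_of_each_session(rows)
for userid, t in kept:
    print(userid, t)
```

Because session_start moves only when a row is kept, 16-MAR-17 10.25.51.994 opens a new session (it is more than 15 minutes after 10.10.20.781), which gives the 4 rows asked for in the question.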

This problem can be solved quite easily with the MATCH_RECOGNIZE clause, available in Oracle 12.1 and above. Alas, it is not available in Oracle 11 or earlier.

with
     test_data ( userid, trans_time ) as (
       select '00000001', to_timestamp('24-FEB-17 15.13.51.713000000', 'dd-MON-yy hh24.mi.ss.ff') from dual union all
       select '00000001', to_timestamp('16-MAR-17 10.10.20.781000000', 'dd-MON-yy hh24.mi.ss.ff') from dual union all
       select '00000001', to_timestamp('16-MAR-17 10.10.32.659000000', 'dd-MON-yy hh24.mi.ss.ff') from dual union all
       select '00000001', to_timestamp('16-MAR-17 10.13.04.070000000', 'dd-MON-yy hh24.mi.ss.ff') from dual union all
       select '00000001', to_timestamp('16-MAR-17 10.13.49.339000000', 'dd-MON-yy hh24.mi.ss.ff') from dual union all
       select '00000001', to_timestamp('16-MAR-17 10.22.33.467000000', 'dd-MON-yy hh24.mi.ss.ff') from dual union all
       select '00000001', to_timestamp('16-MAR-17 10.23.09.755000000', 'dd-MON-yy hh24.mi.ss.ff') from dual union all
       select '00000001', to_timestamp('16-MAR-17 10.25.51.994000000', 'dd-MON-yy hh24.mi.ss.ff') from dual union all
       select '00000001', to_timestamp('16-MAR-17 10.26.08.130000000', 'dd-MON-yy hh24.mi.ss.ff') from dual union all
       select '00000001', to_timestamp('29-MAR-17 10.23.01.665000000', 'dd-MON-yy hh24.mi.ss.ff') from dual
     )
-- End of test data (not part of the solution). SQL query begins below this line.
select userid, session_start as trans_time
from   test_data
match_recognize (
  partition by userid
  order by     trans_time
  measures     a.trans_time as session_start
  pattern      ( a b* )
  define       b as b.trans_time <= a.trans_time + interval '15' minute
)
order by userid, trans_time    --   if needed
;

USERID    TRANS_TIME             
--------  ------------------------------
00000001  24-FEB-2017 15.13.51.713000000
00000001  16-MAR-2017 10.10.20.781000000
00000001  16-MAR-2017 10.25.51.994000000
00000001  29-MAR-2017 10.23.01.665000000

Answer 2 (score: 1)

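Using the same assumptions as in my other answer (the one using the MATCH_RECOGNIZE clause), here is another way to solve the problem.

This solution uses recursive subquery factoring (a recursive CTE), so it works in Oracle 11.2 (but unfortunately not in earlier versions).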

with
-- Begin test data (not part of the solution)
     test_data ( userid, trans_time ) as (
       [     select ......    SAME AS IN THE OTHER ANSWER     ]
     ),
-- End of test data (not part of the solution). SQL query begins below this line.
     prep ( userid, trans_time, rn ) as (
       select userid, trans_time, 
              row_number() over (partition by userid order by trans_time)
       from   test_data
     ),
     rec ( userid, trans_time, rn, session_start ) as (
       select     userid, min(trans_time), 1, min(trans_time)
         from     prep
         group by userid
       union all
         select   p.userid, p.trans_time, p.rn,
                  case when p.trans_time > r.session_start + interval '15' minute
                       then p.trans_time
                       else r.session_start
                  end
         from     prep p join rec r on p.userid = r.userid and p.rn = r.rn + 1
     )
select   distinct userid, trans_time
from     rec
where    trans_time = session_start
order by userid, trans_time       --   if needed
;