SQL query help: how do I select only the start and end rows of a group (Oracle)?

Time: 2010-03-17 14:34:22

Tags: sql oracle

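(Apologies for the title of this question; I wasn't quite sure how to explain it.)

I'm not sure whether this can be done in SQL. Below is a (somewhat truncated) sample of an event log table.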

EVENT      ID          DATE      TIME
---------  ----------  --------  ----
ONE_THING  0006241800  20091109  1719
ONE_THING  0006944800  20091109  1720
ANOTHER    0007517110  20091109  1721
ANOTHER    0007214240  20091109  1721
ANOTHER    0006907900  20091109  1725
ANOTHER    0006501580  20091109  1727
ONE_THING  0006944800  20091109  1737
ANOTHER    0005749820  20091109  1737
ANOTHER    0006810500  20091109  1738
ANOTHER    0007481970  20091109  1738
ANOTHER    0006331740  20091109  1739
ANOTHER    0007253840  20091109  1739
ANOTHER    0006929280  20091109  1747
ANOTHER    0007297950  20091109  1749
ANOTHER    0005055560  20091109  1751
ANOTHER    0006092320  20091109  1751
ONE_THING  0001668720  20091109  1753
ONE_THING  0007218000  20091109  1754

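I'm looking for groups of consecutive ANOTHER events, with no other event occurring inside the group, that span more than 2 minutes.

So, in the data set above, the first group is: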

ANOTHER    0007517110  20091109  1721
ANOTHER    0007214240  20091109  1721
ANOTHER    0006907900  20091109  1725
ANOTHER    0006501580  20091109  1727

and the second is:

ANOTHER    0005749820  20091109  1737
ANOTHER    0006810500  20091109  1738
ANOTHER    0007481970  20091109  1738
ANOTHER    0006331740  20091109  1739
ANOTHER    0007253840  20091109  1739
ANOTHER    0006929280  20091109  1747
ANOTHER    0007297950  20091109  1749
ANOTHER    0005055560  20091109  1751
ANOTHER    0006092320  20091109  1751

Ideally, I would like to get:

ANOTHER    0007517110  20091109  1721
ANOTHER    0006501580  20091109  1727

and:

ANOTHER    0005749820  20091109  1737
ANOTHER    0006092320  20091109  1751

or, even better:

EVENT      DATE      TIME_START  TIME_END
---------  --------  ----------  --------
ANOTHER    20091109  1721        1727
ANOTHER    20091109  1737        1751

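I have considered comparing rows, but maybe there is a better way? I'd appreciate any hints. The solution only needs to work; it doesn't have to be fancy or elegant.

PS: I'm using Oracle.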

5 answers:

Answer 0: (score: 1)

This should work:

SELECT event, MIN(dt), MAX(dt) FROM (
   -- The running sum of the flags numbers each run of consecutive equal events.
   SELECT event, dt,
          SUM(discontinuity) over(ORDER BY dt, event) continuous_group
     FROM (SELECT event, dt,
                   -- Flag 1 where the event differs from the previous row's event.
                   CASE
                      WHEN lag(event) over(ORDER BY dt, event) = event THEN
                       0
                      ELSE
                       1
                   END discontinuity
              FROM DATA)
   )
 WHERE event = 'ANOTHER'
GROUP BY event, continuous_group;

EVENT     MIN(DT)       MAX(DT)
--------- ------------- -------------
ANOTHER   20091109 1738 20091109 1751
ANOTHER   20091109 1721 20091109 1737

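Note: the two events at 17:37 are simultaneous, and my query arbitrarily puts the ANOTHER event into the first group. You can control this behavior with the ORDER BY clause of the analytic functions.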
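To make the tie-break effect concrete, here is a minimal Python sketch that traces the same flag-and-running-sum logic over the question's sample rows. It is an illustration under assumptions, not part of the original answer: the helper `islands` and the second sort key (ONE_THING first on equal timestamps) are hypothetical, and timestamps are compared as plain "YYYYMMDD HHMM" strings.

```python
from itertools import accumulate

# Event, date and time columns from the question's sample (ID omitted).
SAMPLE = """\
ONE_THING 20091109 1719
ONE_THING 20091109 1720
ANOTHER 20091109 1721
ANOTHER 20091109 1721
ANOTHER 20091109 1725
ANOTHER 20091109 1727
ONE_THING 20091109 1737
ANOTHER 20091109 1737
ANOTHER 20091109 1738
ANOTHER 20091109 1738
ANOTHER 20091109 1739
ANOTHER 20091109 1739
ANOTHER 20091109 1747
ANOTHER 20091109 1749
ANOTHER 20091109 1751
ANOTHER 20091109 1751
ONE_THING 20091109 1753
ONE_THING 20091109 1754
"""
ROWS = [(e, f"{d} {t}") for e, d, t in (line.split() for line in SAMPLE.splitlines())]


def islands(rows, sort_key, event="ANOTHER"):
    """Return (min_dt, max_dt) for each consecutive run of `event` (hypothetical helper)."""
    ordered = sorted(rows, key=sort_key)
    # Discontinuity flag: 1 when the previous row has a different event
    # (or there is no previous row), mirroring the CASE ... LAG expression.
    flags = [
        1 if i == 0 or ordered[i - 1][0] != ev else 0
        for i, (ev, _) in enumerate(ordered)
    ]
    group_ids = accumulate(flags)  # running SUM(discontinuity)
    spans = {}
    for (ev, dt), gid in zip(ordered, group_ids):
        if ev == event:
            lo, hi = spans.get(gid, (dt, dt))
            spans[gid] = (min(lo, dt), max(hi, dt))
    return [spans[gid] for gid in sorted(spans)]


# ORDER BY dt, event: 'ANOTHER' sorts before 'ONE_THING' at 17:37.
by_dt_event = islands(ROWS, lambda r: (r[1], r[0]))
# Assumed alternative tie-break: ONE_THING first when timestamps are equal.
one_thing_first = islands(ROWS, lambda r: (r[1], r[0] != "ONE_THING"))

print(by_dt_event)
print(one_thing_first)
```

With the first ordering the 17:37 ANOTHER row closes the first group (17:21 to 17:37), as in the output above; with the second it opens the second group, which gives the 17:21 to 17:27 and 17:37 to 17:51 ranges the question asks for.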

Answer 1: (score: 1)

This extends Vincent's answer to include the requirement that the group must span more than 2 minutes:

select event, tm_start, tm_stop
from (select event, min(when) tm_start, max(when) tm_stop
      from (select event,
                   when,
                   sum(discontinuity) over(order by when, event) continuous_group
              from (select event,
                           when,
                           case
                             when lag(event)
                              over(order by when, event) = event then
                              0
                             else
                              1
                           end discontinuity
                      from temp_stack ts))
     where event = 'ANOTHER'
     group by event, continuous_group)
where tm_stop - numtodsinterval(2, 'MINUTE') > tm_start;

Answer 2: (score: 0)

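SELECT  *
FROM    (
        -- DATE is a reserved word in Oracle, so the column is quoted here
        -- (assuming it was created as "DATE").
        -- The third argument to LEAD/LAG supplies a default, so the first and
        -- last rows of the table compare as different instead of yielding NULL.
        SELECT  m.*,
                LEAD(event, 1, '-') OVER (ORDER BY "DATE", time) AS ne,
                LAG(event, 1, '-') OVER (ORDER BY "DATE", time) AS pe
        FROM    mytable m
        )
WHERE   event = 'ANOTHER'
        AND (ne <> event OR pe <> event)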

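Answer 3: (score: 0)

This should get you started. Note that if you have multiple events with the same date and time (as you do in your sample), the result is non-deterministic. If it makes sense for your data, you can add ID to the ORDER BY clause as a tie-breaker.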

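SELECT * FROM (
  SELECT event, id, "DATE", time,  -- DATE is reserved in Oracle, hence the quotes
         lag(event) over (order by "DATE", time) previous_event,
         lead(event) over (order by "DATE", time) next_event
  FROM your_table  -- placeholder name: the original omitted the FROM clause
)
WHERE event='ANOTHER'
  AND ( event <> previous_event OR event <> next_event )
ORDER BY "DATE", time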
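The LAG/LEAD filter used in this answer and in the previous one can be traced in Python over the question's sample. This is only a sketch under assumptions: `boundary_rows` is a hypothetical helper, and the rows are taken in the order the question lists them, which is one of the possible orderings when timestamps tie.

```python
# The question's sample: event, id, date, time.
SAMPLE = """\
ONE_THING 0006241800 20091109 1719
ONE_THING 0006944800 20091109 1720
ANOTHER 0007517110 20091109 1721
ANOTHER 0007214240 20091109 1721
ANOTHER 0006907900 20091109 1725
ANOTHER 0006501580 20091109 1727
ONE_THING 0006944800 20091109 1737
ANOTHER 0005749820 20091109 1737
ANOTHER 0006810500 20091109 1738
ANOTHER 0007481970 20091109 1738
ANOTHER 0006331740 20091109 1739
ANOTHER 0007253840 20091109 1739
ANOTHER 0006929280 20091109 1747
ANOTHER 0007297950 20091109 1749
ANOTHER 0005055560 20091109 1751
ANOTHER 0006092320 20091109 1751
ONE_THING 0001668720 20091109 1753
ONE_THING 0007218000 20091109 1754
"""


def boundary_rows(rows, event="ANOTHER"):
    """Keep rows of `event` whose previous or next row is a different event.

    `rows` must already be in the intended ORDER BY order.
    """
    kept = []
    for i, row in enumerate(rows):
        if row[0] != event:
            continue
        prev_event = rows[i - 1][0] if i > 0 else None              # LAG(event)
        next_event = rows[i + 1][0] if i + 1 < len(rows) else None  # LEAD(event)
        # Python treats None != event as True. In SQL a comparison with a NULL
        # LAG/LEAD result is not true, so a group touching the first or last
        # row of the table would be missed unless a default is supplied.
        if prev_event != event or next_event != event:
            kept.append(row)
    return kept


rows = [tuple(line.split()) for line in SAMPLE.splitlines()]
edges = boundary_rows(rows)
for edge in edges:
    print(*edge)
```

In this ordering the four rows kept are the start and end rows listed in the question. A group consisting of a single row would appear only once, so the rows cannot always be paired up blindly.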

Answer 4: (score: 0)

Probably a bit late :)

SELECT
  event,
  min(dt) as dt_begin, max(dt) as dt_end
FROM 
(
select
  t.*,
  row_number()over(order by dt,rownum) -
  row_number()over(partition by event order by dt,rownum) as group_id
from vvp_tmp t
--order by dt
)
GROUP BY group_id,event
HAVING 24*60*(max(dt)-min(dt))>=2
ORDER BY dt_begin
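The query relies on the fact that, within a run of consecutive rows for the same event, the overall row number and the per-event row number grow together, so their difference stays constant. A minimal Python sketch of that idea follows; `parse` and `long_runs` are hypothetical helpers, and the rows are assumed to arrive in the order listed in the question (standing in for `order by dt, rownum`).

```python
from collections import defaultdict
from datetime import datetime

# Event, date and time columns from the question's sample (ID omitted).
SAMPLE = """\
ONE_THING 20091109 1719
ONE_THING 20091109 1720
ANOTHER 20091109 1721
ANOTHER 20091109 1721
ANOTHER 20091109 1725
ANOTHER 20091109 1727
ONE_THING 20091109 1737
ANOTHER 20091109 1737
ANOTHER 20091109 1738
ANOTHER 20091109 1738
ANOTHER 20091109 1739
ANOTHER 20091109 1739
ANOTHER 20091109 1747
ANOTHER 20091109 1749
ANOTHER 20091109 1751
ANOTHER 20091109 1751
ONE_THING 20091109 1753
ONE_THING 20091109 1754
"""


def parse(sample):
    """Turn 'EVENT YYYYMMDD HHMM' lines into (event, datetime) tuples."""
    rows = []
    for line in sample.splitlines():
        event, date, time = line.split()
        rows.append((event, datetime.strptime(date + time, "%Y%m%d%H%M")))
    return rows


def long_runs(rows, min_minutes=2):
    """Group consecutive equal events and keep runs spanning >= min_minutes."""
    overall = 0
    per_event = defaultdict(int)
    spans = {}
    for event, dt in rows:
        overall += 1           # row_number() over (order by ...)
        per_event[event] += 1  # row_number() over (partition by event ...)
        key = (overall - per_event[event], event)  # GROUP BY group_id, event
        first, last = spans.get(key, (dt, dt))
        spans[key] = (min(first, dt), max(last, dt))
    kept = sorted(
        (first, last, event)
        for (_, event), (first, last) in spans.items()
        if (last - first).total_seconds() / 60 >= min_minutes  # HAVING ... >= 2
    )  # sorting by the first element mirrors ORDER BY dt_begin
    return [(event, first.strftime("%H%M"), last.strftime("%H%M"))
            for first, last, event in kept]


result = long_runs(parse(SAMPLE))
for row in result:
    print(*row)
```

Note that neither the query nor this sketch filters on the event name. On the sample data the ONE_THING runs drop out only because each of them spans less than 2 minutes, so adding a condition on event = 'ANOTHER' may be needed for other data.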