Group records by a specific condition and find the maximum value

Time: 2014-01-15 10:10:02

Tags: sql oracle oracle11g gaps-and-islands

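I have a veh_speed table with the fields vid, date_time, speed and status. My goal is to get the periods (start_date_time, end_date_time) during which a vehicle's speed is 30 or higher. At the moment I generate the report with PL/SQL. Is it possible to do this in plain SQL? It would be great if I could also get the max_speed within each range.

My table looks like this: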

VID  DATE_TIME            SPEED  STATUS
---  -------------------  -----  ------
1    15/01/2014 10:00:05      0  N
1    15/01/2014 10:00:10     10  Y
1    15/01/2014 10:00:15     30  Y
1    15/01/2014 10:00:20     35  Y
1    15/01/2014 10:00:25     45  Y
1    15/01/2014 10:00:27     10  Y
1    15/01/2014 10:00:29      0  Y
1    15/01/2014 10:00:30     20  Y
1    15/01/2014 10:00:35     32  Y
1    15/01/2014 10:00:40     33  Y
1    15/01/2014 10:00:45     35  Y
1    15/01/2014 10:00:50     38  Y
1    15/01/2014 10:00:55     10  Y

I would like to get the following output:

VID  START_DATE_TIME      END_DATE_TIME        MAX_SPEED
---  -------------------  -------------------  ---------
1    15/01/2014 10:00:15  15/01/2014 10:00:25         45
1    15/01/2014 10:00:35  15/01/2014 10:00:50         38

Here is the table creation script:

CREATE TABLE veh_speed(vid NUMBER(3), 
             date_time DATE, 
             speed NUMBER(3), 
             status CHAR(1));

INSERT ALL
     INTO veh_speed VALUES(1, to_date('15/01/2014 10:00:05', 'dd/mm/yyyy hh24:mi:ss'),  0,  'N')
     INTO veh_speed VALUES(1, to_date('15/01/2014 10:00:10', 'dd/mm/yyyy hh24:mi:ss'), 10, 'Y')
     INTO veh_speed VALUES(1, to_date('15/01/2014 10:00:15', 'dd/mm/yyyy hh24:mi:ss'), 30, 'Y')
     INTO veh_speed VALUES(1, to_date('15/01/2014 10:00:20', 'dd/mm/yyyy hh24:mi:ss'), 35, 'Y')
     INTO veh_speed VALUES(1, to_date('15/01/2014 10:00:25', 'dd/mm/yyyy hh24:mi:ss'), 45, 'Y')
     INTO veh_speed VALUES(1, to_date('15/01/2014 10:00:27', 'dd/mm/yyyy hh24:mi:ss'), 10, 'Y')
     INTO veh_speed VALUES(1, to_date('15/01/2014 10:00:29', 'dd/mm/yyyy hh24:mi:ss'),  0, 'Y')
     INTO veh_speed VALUES(1, to_date('15/01/2014 10:00:30', 'dd/mm/yyyy hh24:mi:ss'), 20, 'Y')
     INTO veh_speed VALUES(1, to_date('15/01/2014 10:00:35', 'dd/mm/yyyy hh24:mi:ss'), 32, 'Y')
     INTO veh_speed VALUES(1, to_date('15/01/2014 10:00:40', 'dd/mm/yyyy hh24:mi:ss'), 33, 'Y')
     INTO veh_speed VALUES(1, to_date('15/01/2014 10:00:45', 'dd/mm/yyyy hh24:mi:ss'), 35, 'Y')
     INTO veh_speed VALUES(1, to_date('15/01/2014 10:00:50', 'dd/mm/yyyy hh24:mi:ss'), 38, 'Y')
     INTO veh_speed VALUES(1, to_date('15/01/2014 10:00:55', 'dd/mm/yyyy hh24:mi:ss'), 10, 'Y')
SELECT * FROM dual;

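I hope I have made my question clear.

Thanks in advance.

3 answers:

Answer 0 (score: 5)

You can use analytic functions to group the records into blocks where the speed is 30 or higher: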

select vid, date_time, speed, status,
  case when speed >= 30 then 30 else 0 end as speed_limit,
  row_number() over (partition by vid order by date_time)
    - row_number() over (
      partition by vid, case when speed >= 30 then 30 else 0 end
      order by date_time) as chain
from veh_speed;

      VID DATE_TIME                SPEED STATUS SPEED_LIMIT      CHAIN
---------- ------------------- ---------- ------ ----------- ----------
         1 15/01/2014 10:00:05          0 N                0          0 
         1 15/01/2014 10:00:10         10 Y                0          0 
         1 15/01/2014 10:00:15         30 Y               30          2 
         1 15/01/2014 10:00:20         35 Y               30          2 
         1 15/01/2014 10:00:25         45 Y               30          2 
         1 15/01/2014 10:00:27         10 Y                0          3 
         1 15/01/2014 10:00:29          0 Y                0          3 
         1 15/01/2014 10:00:30         20 Y                0          3 
         1 15/01/2014 10:00:35         32 Y               30          5 
         1 15/01/2014 10:00:40         33 Y               30          5 
         1 15/01/2014 10:00:45         35 Y               30          5 
         1 15/01/2014 10:00:50         38 Y               30          5 
         1 15/01/2014 10:00:55         10 Y                0          7 

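Unfortunately I can't take credit for the trick of using two row_number() calls to generate chains of records; I picked it up somewhere (possibly here). The actual value of chain doesn't matter. What matters is that it is the same for all records within a contiguous block of records matching your criteria, and that it differs between blocks within each vid. Strictly speaking it only differs between blocks that share the same speed_limit, so chain has to be used together with speed_limit.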
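To see why the difference of the two row_number() values stays constant inside a run, here is a small Python sketch that emulates both counters on the sample data. It is not part of the original answer; the helper name chain_ids and the hard-coded speed list are made up for illustration, and it handles a single vid only.

```python
from collections import defaultdict

# Speeds from the question for vid 1, already ordered by date_time.
SPEEDS = [0, 10, 30, 35, 45, 10, 0, 20, 32, 33, 35, 38, 10]


def chain_ids(speeds, limit=30):
    """Emulate row_number() over all rows minus row_number() per flag value.

    Hypothetical helper for illustration only; handles a single vid.
    """
    seen_per_flag = defaultdict(int)  # second row_number(): one counter per flag
    chains = []
    for overall_rn, speed in enumerate(speeds, start=1):  # first row_number()
        flag = speed >= limit
        seen_per_flag[flag] += 1
        # Inside a run both counters grow by 1 per row, so the difference is fixed.
        chains.append(overall_rn - seen_per_flag[flag])
    return chains


if __name__ == "__main__":
    print(chain_ids(SPEEDS))
```

For the sample speeds this reproduces the CHAIN column shown above. Calling chain_ids([0, 30, 0]) gives [0, 1, 1]: the fast row and the slow row after it share a chain value, which is why the query also needs speed_limit (in the filter or in the grouping).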

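You're only interested in the chains of records where the "speed limit" is 30 (this could just as easily be a Y/N flag or anything else), so you can use that query as an inline view and filter out the chains where the speed is below 30; then use ordinary aggregate functions to get what you want: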

select vid,
  min(date_time) as start_date_time,
  max(date_time) as end_date_time,
  max(speed) as max_speed
from (
  select vid, date_time, speed, status,
    case when speed >= 30 then 30 else 0 end as speed_limit,
    row_number() over (partition by vid order by date_time)
      - row_number() over (
        partition by vid, case when speed >= 30 then 30 else 0 end
        order by date_time) as chain
  from veh_speed
)
where speed_limit = 30
group by vid, chain
order by vid, start_date_time;

       VID START_DATE_TIME     END_DATE_TIME        MAX_SPEED
---------- ------------------- ------------------- ----------
         1 15/01/2014 10:00:15 15/01/2014 10:00:25         45 
         1 15/01/2014 10:00:35 15/01/2014 10:00:50         38 

SQL Fiddle

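Answer 1 (score: 2)

This is a well-known problem, usually called "start of group"; you can search for that term. The general approach is:
a) define the criterion that separates the qualifying rows from the others;
b) sort the rows in the right order;
c) build a group column that gives each period its own value, so the rows are split in time;
d) group by that column.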
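The steps above can be sketched outside the database as well. The following Python version is only an illustration under assumptions: the rows are the question's sample data with the date part dropped, and fast_periods is a hypothetical name. Here itertools.groupby plays the role of the group column, because it starts a new group whenever the (vid, flag) key changes between consecutive rows.

```python
from itertools import groupby

# (vid, time, speed) rows from the question; same-day HH:MM:SS strings sort correctly.
ROWS = [
    (1, "10:00:05", 0), (1, "10:00:10", 10), (1, "10:00:15", 30),
    (1, "10:00:20", 35), (1, "10:00:25", 45), (1, "10:00:27", 10),
    (1, "10:00:29", 0), (1, "10:00:30", 20), (1, "10:00:35", 32),
    (1, "10:00:40", 33), (1, "10:00:45", 35), (1, "10:00:50", 38),
    (1, "10:00:55", 10),
]


def fast_periods(rows, limit=30):
    """Return (vid, start, end, max_speed) for each run of rows with speed >= limit."""
    ordered = sorted(rows, key=lambda r: (r[0], r[1]))  # sort by vid, then time
    periods = []
    # The key is the criterion; a change of key between neighbours opens a new group.
    for (vid, is_fast), block in groupby(ordered, key=lambda r: (r[0], r[2] >= limit)):
        if not is_fast:
            continue  # keep only the qualifying runs
        block = list(block)
        periods.append((vid, block[0][1], block[-1][1], max(r[2] for r in block)))
    return periods


if __name__ == "__main__":
    for period in fast_periods(ROWS):
        print(period)
```

In SQL there is no direct equivalent of "the key changed since the previous row", which is why the answers here derive a group column with analytic functions instead.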

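Here is an example for this particular case. It relies on the qualifying readings being exactly 5 seconds apart, which is what the 5/24/3600 term (5 seconds expressed in days) encodes: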

select vid, min(date_time) start_time, max(date_time) end_time, max(speed) max_speed
from (
  -- group_time is constant within a run of qualifying rows spaced 5 seconds apart
  select vid, date_time,
         date_time - (row_number() over (partition by vid order by date_time)) * speed_sign * 5/24/3600 group_time,
         speed_sign, speed
  from (
    -- speed_sign is 1 when speed >= 30 and -1 otherwise
    select vid, date_time, decode(sign(speed-30), 0, 1, sign(speed-30)) speed_sign, speed
    from veh_speed order by date_time
  )
)
where speed_sign > 0
group by vid, group_time;


   VID START_TIME          END_TIME             MAX_SPEED
------ ------------------- ------------------- ----------
     1 15.01.2014 10:00:15 15.01.2014 10:00:25         45
     1 15.01.2014 10:00:35 15.01.2014 10:00:50         38
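The group_time expression works on the sample data because the readings at 30 or above are exactly 5 seconds apart. A rough Python sketch of the same arithmetic shows the key staying constant for regular spacing and breaking otherwise. The helper group_times and both sample lists are made up for illustration, and only one vid is handled.

```python
from datetime import datetime, timedelta


def group_times(samples, limit=30, step_seconds=5):
    """Return the group_time key of every qualifying (time, speed) sample.

    Hypothetical helper for one vid; mirrors date_time - row_number * 5 seconds.
    """
    keys = []
    for rn, (hms, speed) in enumerate(samples, start=1):  # row_number over all rows
        if speed < limit:
            continue  # the SQL drops these rows with "speed_sign > 0"
        moment = datetime.strptime(hms, "%H:%M:%S")
        key = moment - timedelta(seconds=rn * step_seconds)
        keys.append(key.strftime("%H:%M:%S"))
    return keys


# One run of three fast readings, sampled every 5 seconds.
REGULAR = [("10:00:10", 10), ("10:00:15", 30), ("10:00:20", 35), ("10:00:25", 45)]
# The same run, but the middle reading arrives after 3 seconds instead of 5.
IRREGULAR = [("10:00:10", 10), ("10:00:15", 30), ("10:00:18", 35), ("10:00:25", 45)]

if __name__ == "__main__":
    print(group_times(REGULAR))
    print(group_times(IRREGULAR))
```

With REGULAR all three keys are equal, so the rows fall into one group. With IRREGULAR the middle reading gets a different key and the single run is no longer grouped correctly, so this variant is best kept for data with a fixed sampling interval.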

Answer 2 (score: 1)

I used subqueries to do the grouping (though I guess this is not as clear as Alex's explanation):

select z.vid, min(z.date_time) start_time, z.end_time, max(z.speed) max_speed
from
(
  with w as
  (
    select y.vid, y.date_time, y.speed, y.status, y.over_30, y.next_time, decode(y.next_time_over_30, y.next_time, 'N', 'Y') end_of_block
    from
    (
      select x.vid, x.date_time, x.speed, x.status, x.over_30, x.next_time, lead(x.date_time, 1, null) over (partition by x.vid order by x.date_time) next_time_over_30
      from
      (
        select vs.vid, vs.date_time, vs.speed, vs.status, case when vs.speed >= 30 then 'Y' else 'N' end over_30, lead(vs.date_time, 1, null) over (partition by vs.vid order by vs.date_time) next_time
        from veh_speed vs
      ) x 
      where x.over_30 = 'Y'
    ) y
  )
  select w1.vid, w1.date_time, w1.speed, w1.status, w1.over_30, w1.next_time, w1.end_of_block, min(w2.date_time) end_time
  from w w1, w w2
  where w2.end_of_block = 'Y'
    and w2.vid = w1.vid  -- only match block ends of the same vehicle
    and w2.date_time >= w1.date_time
  group by w1.vid, w1.date_time, w1.speed, w1.status, w1.over_30, w1.next_time, w1.end_of_block
  order by w1.vid, w1.date_time
) z
group by z.vid, z.end_time
;

This gives:

VID     START_TIME              END_TIME                MAX_SPEED
1       Jan-15-2014 10:00:35    Jan-15-2014 10:00:50    38
1       Jan-15-2014 10:00:15    Jan-15-2014 10:00:25    45