Oracle: group records into rows by continuous range

Time: 2016-06-08 14:43:50

Tags: oracle date

I need to group and summarize the rows for one day by continuous ranges of working time.

Definition of the attendance table:

row_no NUMBER (*,0) NOT NULL,        -- row number - generated from a sequence  
worker_id NUMBER NOT NULL,           -- Attendance worker id  
date1 DATE DEFAULT SYSDATE NOT NULL, -- Attendance Date/time  
type1 NUMBER(3,0) NOT NULL,          -- Attendance type: 0-Enter, 1-Exit  

worker_id date1             type1  
2         13/06/2016-09:00  0  
3         13/06/2016-12:10  0  
2         13/06/2016-13:20  1  
2         13/06/2016-15:00  0  
2         13/06/2016-17:00  1  
3         13/06/2016-18:45  1  
2         13/06/2016-19:00  0  

If the report is run at 22:00, the result should be:

worker_id date1      fr_hour to_hour hours  
2         13/06/2016 09:00   13:20   4:20  
2         13/06/2016 15:00   17:00   2:00  
2         13/06/2016 19:00   22:00   3:00  
3         13/06/2016 12:10   18:45   6:35  
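
To try the queries against the sample rows, the table can be populated with a script along these lines. This is only a minimal sketch: the table name attendance matches the column definitions above, while the sequence name attendance_seq is a hypothetical choice, since the question only says that row_no is generated from a sequence.

```sql
-- Hypothetical setup script; attendance_seq is an assumed sequence name.
CREATE TABLE attendance (
  row_no    NUMBER(*,0) NOT NULL,          -- filled from the sequence below
  worker_id NUMBER NOT NULL,
  date1     DATE DEFAULT SYSDATE NOT NULL,
  type1     NUMBER(3,0) NOT NULL           -- 0 = enter, 1 = exit
);

CREATE SEQUENCE attendance_seq;

-- One INSERT per sample row, so that each row gets its own NEXTVAL.
INSERT INTO attendance VALUES (attendance_seq.NEXTVAL, 2, TO_DATE('13/06/2016 09:00', 'DD/MM/YYYY HH24:MI'), 0);
INSERT INTO attendance VALUES (attendance_seq.NEXTVAL, 3, TO_DATE('13/06/2016 12:10', 'DD/MM/YYYY HH24:MI'), 0);
INSERT INTO attendance VALUES (attendance_seq.NEXTVAL, 2, TO_DATE('13/06/2016 13:20', 'DD/MM/YYYY HH24:MI'), 1);
INSERT INTO attendance VALUES (attendance_seq.NEXTVAL, 2, TO_DATE('13/06/2016 15:00', 'DD/MM/YYYY HH24:MI'), 0);
INSERT INTO attendance VALUES (attendance_seq.NEXTVAL, 2, TO_DATE('13/06/2016 17:00', 'DD/MM/YYYY HH24:MI'), 1);
INSERT INTO attendance VALUES (attendance_seq.NEXTVAL, 3, TO_DATE('13/06/2016 18:45', 'DD/MM/YYYY HH24:MI'), 1);
INSERT INTO attendance VALUES (attendance_seq.NEXTVAL, 2, TO_DATE('13/06/2016 19:00', 'DD/MM/YYYY HH24:MI'), 0);

COMMIT;
```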

2 answers:

Answer 0 (score: 1)

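In the inner query, for each row we fetch date1 and type1 from the same worker's next row (LEAD 1), and then filter for only the pairs we need, that is, an entry immediately followed by an exit on the same day: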

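SELECT worker_id, 
       TRUNC (date1) AS date1, 
       TO_CHAR (date1, 'HH24:MI') fr_hour, 
       TO_CHAR (date2, 'HH24:MI') to_hour, 
       -- Elapsed time as H:MI. Round to whole minutes first so that
       -- fractional-day arithmetic cannot produce an off-by-one hour;
       -- FM suppresses the leading sign space that '00' alone would add.
       TRUNC (ROUND ( (date2 - date1) * 24 * 60) / 60) || ':' || 
         TO_CHAR (MOD (ROUND ( (date2 - date1) * 24 * 60), 60), 'FM00') hours
  FROM (SELECT a.*, 
               -- date and type of the same worker's next row
               LEAD (a.date1, 1) OVER (PARTITION BY worker_id ORDER BY date1) date2, 
               LEAD (a.type1, 1) OVER (PARTITION BY worker_id ORDER BY date1) type2
          FROM testtemp a)  -- testtemp: the table name used in this answer; the question calls it attendance
 WHERE type1 = 0            -- current row is an entry
   AND type2 = 1            -- next row is an exit
   AND TRUNC (date1) = TRUNC (date2)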
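
Note that a query of this shape drops an entry that has no exit yet, such as worker 2's 19:00 row in the question. One possible way to close such an entry at the 22:00 report time is to give LEAD a default value. The following is a sketch only: it assumes the table is named attendance, hard-codes the report date and the 22:00 cutoff, and does not handle a period that started on the previous day.

```sql
-- Sketch: LEAD with a default, so an entry with no later row ends at 22:00.
-- The date literals and the 22/24 cutoff are assumptions for this example.
SELECT worker_id,
       TO_CHAR(date1, 'DD/MM/YYYY') AS date1,
       TO_CHAR(date1, 'HH24:MI') AS fr_hour,
       TO_CHAR(date2, 'HH24:MI') AS to_hour,
       -- whole minutes first, then split into hours and minutes
       TRUNC(ROUND((date2 - date1) * 1440) / 60) || ':' ||
         TO_CHAR(MOD(ROUND((date2 - date1) * 1440), 60), 'FM00') AS hours
  FROM (SELECT a.worker_id, a.date1, a.type1,
               -- third argument: value used when the worker has no next row
               LEAD(a.date1, 1, TRUNC(a.date1) + 22/24)
                 OVER (PARTITION BY a.worker_id ORDER BY a.date1) AS date2,
               LEAD(a.type1, 1, 1)
                 OVER (PARTITION BY a.worker_id ORDER BY a.date1) AS type2
          FROM attendance a
         WHERE a.date1 >= DATE '2016-06-13'
           AND a.date1 <  DATE '2016-06-14')
 WHERE type1 = 0
   AND type2 = 1
 ORDER BY worker_id, fr_hour;
```

With the question's sample data this should return the four rows of the requested result, including the 19:00 to 22:00 row for worker 2.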

Answer 1 (score: 0)

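A continuous period that starts on an earlier day complicates things. You could calculate all the ranges for all dates, going back to the start of time (assuming you have not archived old records, and that the first entry for any worker in the data is not a check-out), and then filter on the date you are interested in once all that work is done. Or you can look only at the data for that date, and see whether a worker's records start with a check-in or a check-out.

I have added records for a fourth worker: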

 WORKER_ID DATE1                 TYPE1
---------- ---------------- ----------
         4 2016-06-12 19:00          0
         4 2016-06-13 03:00          1
         2 2016-06-13 09:00          0
         3 2016-06-13 12:10          0
         4 2016-06-13 13:00          0
         2 2016-06-13 13:20          1
         4 2016-06-13 14:30          1
         2 2016-06-13 15:00          0
         2 2016-06-13 17:00          1
         3 2016-06-13 18:45          1
         4 2016-06-13 19:00          0
         2 2016-06-13 19:00          0

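You can use analytic functions to calculate a row number for each entry, and also to find the first type1 value for each worker on that day; this also effectively pivots the in/out times into separate columns: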

select worker_id, trunc(date1) as date1, type1,
  case when type1 = 0 then date1 end as time_in,
  case when type1 = 1 then date1 end as time_out,
  row_number() over (partition by worker_id, trunc(date1), type1 order by date1) as rn,
  min(type1) keep (dense_rank first order by date1) over (partition by worker_id, trunc(date1)) as open_start,
  max(type1) keep (dense_rank last order by date1) over (partition by worker_id, trunc(date1)) as open_end,
  row_number() over (partition by worker_id, trunc(date1), type1 order by date1)
    - case when type1 = 1 then min(type1) keep (dense_rank first order by date1)
        over (partition by worker_id, trunc(date1)) else 0 end as grp
from attendance
where date1 >= date '2016-06-13' and date1 < date '2016-06-14'
order by worker_id, attendance.date1;

 WORKER_ID DATE1                 TYPE1 TIME_IN          TIME_OUT                 RN OPEN_START   OPEN_END        GRP
---------- ---------------- ---------- ---------------- ---------------- ---------- ---------- ---------- ----------
         2 2016-06-13 00:00          0 2016-06-13 09:00                           1          0          0          1
         2 2016-06-13 00:00          1                  2016-06-13 13:20          1          0          0          1
         2 2016-06-13 00:00          0 2016-06-13 15:00                           2          0          0          2
         2 2016-06-13 00:00          1                  2016-06-13 17:00          2          0          0          2
         2 2016-06-13 00:00          0 2016-06-13 19:00                           3          0          0          3
         3 2016-06-13 00:00          0 2016-06-13 12:10                           1          0          1          1
         3 2016-06-13 00:00          1                  2016-06-13 18:45          1          0          1          1
         4 2016-06-13 00:00          1                  2016-06-13 03:00          1          1          0          0
         4 2016-06-13 00:00          0 2016-06-13 13:00                           1          1          0          1
         4 2016-06-13 00:00          1                  2016-06-13 14:30          2          1          0          1
         4 2016-06-13 00:00          0 2016-06-13 19:00                           2          1          0          2

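The rn column was an initial (naive) attempt to group the records together, but it is out of step for worker 4. The open_start column works out whether the first record of the day is a check-out. That value (0 or 1) can then be subtracted from rn on the check-out rows to get a more useful grouping flag, which I have called grp.

You can then use that as an inline view or CTE, and aggregate the time in/out records for each group, adding nvl() or coalesce() to fill in the missing midnight start or 10pm end values: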

select worker_id,
  date1 as date1,
  nvl(min(time_in), date1) as fr_hour,
  nvl(max(time_out), date1 + 22/24) as to_hour,
  date1 + (nvl(max(time_out), date1 + 22/24) - date1)
    - (nvl(min(time_in), date1) - date1) as hours
from (
  select worker_id,
    trunc(date1) as date1,
    case when type1 = 0 then date1 end as time_in,
    case when type1 = 1 then date1 end as time_out,
    row_number() over (partition by worker_id, trunc(date1), type1 order by date1)
      - case when type1 = 1 then min(type1) keep (dense_rank first order by date1)
          over (partition by worker_id, trunc(date1)) else 0 end as grp
  from attendance
  where date1 >= date '2016-06-13' and date1 < date '2016-06-14'
)
group by worker_id, date1, grp
order by worker_id, date1, grp;

 WORKER_ID DATE1            FR_HOUR          TO_HOUR          HOURS          
---------- ---------------- ---------------- ---------------- ----------------
         2 2016-06-13 00:00 2016-06-13 09:00 2016-06-13 13:20 2016-06-13 04:20
         2 2016-06-13 00:00 2016-06-13 15:00 2016-06-13 17:00 2016-06-13 02:00
         2 2016-06-13 00:00 2016-06-13 19:00 2016-06-13 22:00 2016-06-13 03:00
         3 2016-06-13 00:00 2016-06-13 12:10 2016-06-13 18:45 2016-06-13 06:35
         4 2016-06-13 00:00 2016-06-13 00:00 2016-06-13 03:00 2016-06-13 03:00
         4 2016-06-13 00:00 2016-06-13 13:00 2016-06-13 14:30 2016-06-13 01:30
         4 2016-06-13 00:00 2016-06-13 19:00 2016-06-13 22:00 2016-06-13 03:00

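The 'hours' value manipulates the date and the time in/out values so that it displays as another time of day, but it is really an elapsed time.

Finally, you can format the columns to remove the bits you are not interested in: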
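
To make that arithmetic concrete, here is a small sketch for a single pair of values from the sample data (09:00 in, 13:20 out). The column aliases day0, t_in and t_out are hypothetical names used only for this illustration, and the NUMTODSINTERVAL column is an alternative representation that the answer itself does not use.

```sql
-- Worked example of the elapsed-time expression on one in/out pair.
SELECT -- midnight + (offset of out time) - (offset of in time), shown as a time
       TO_CHAR(day0 + (t_out - day0) - (t_in - day0), 'HH24:MI') AS hours_as_time,
       -- the same difference as an INTERVAL DAY TO SECOND value
       NUMTODSINTERVAL(t_out - t_in, 'DAY') AS hours_as_interval
  FROM (SELECT DATE '2016-06-13' AS day0,
               TO_DATE('2016-06-13 09:00', 'YYYY-MM-DD HH24:MI') AS t_in,
               TO_DATE('2016-06-13 13:20', 'YYYY-MM-DD HH24:MI') AS t_out
          FROM dual);
```

Because the result of the first expression is still a DATE, it can only represent elapsed times shorter than 24 hours when formatted with 'HH24:MI'; that is sufficient here because every range is confined to a single day.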

select worker_id,
  to_char(date1, 'DD/MM/YYYY') as date1,
  to_char(nvl(min(time_in), date1), 'HH24:MI') as fr_hour,
  to_char(nvl(max(time_out), date1 + 22/24), 'HH24:MI') as to_hour,
  to_char(date1 + (nvl(max(time_out),date1 + 22/24) - date1)
    - (nvl(min(time_in), date1) - date1), 'HH24:MI') as hours
from (
  select worker_id,
    trunc(date1) as date1,
    case when type1 = 0 then date1 end as time_in,
    case when type1 = 1 then date1 end as time_out,
    row_number() over (partition by worker_id, trunc(date1), type1 order by date1)
      - case when type1 = 1 then min(type1) keep (dense_rank first order by date1)
          over (partition by worker_id, trunc(date1)) else 0 end as grp
  from attendance
  where date1 >= date '2016-06-13' and date1 < date '2016-06-14'
)
group by worker_id, date1, grp
order by worker_id, date1, grp;

 WORKER_ID DATE1      FR_HO TO_HO HOURS
---------- ---------- ----- ----- -----
         2 13/06/2016 09:00 13:20 04:20
         2 13/06/2016 15:00 17:00 02:00
         2 13/06/2016 19:00 22:00 03:00
         3 13/06/2016 12:10 18:45 06:35
         4 13/06/2016 00:00 03:00 03:00
         4 13/06/2016 13:00 14:30 01:30
         4 13/06/2016 19:00 22:00 03:00