I need to calculate time spans for employees in an Oracle database. I need to implement the following logic:
Case 1
Jan 1 - Jan 5
Jan 6 - Jan 10
As the gap between Jan 5 and Jan 6 is only one day, the total output should be
Jan 1 - Jan 10
Case 2
Jan 1 - Jan 5
Jan 7 - Jan 10
As the gap between Jan 5 and Jan 7 is more than one day, the total output should be
Jan 1 - Jan 5
Jan 7 - Jan 10
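Each employee can have any number of rows. I know this can be done with the lead/lag functions, but I can't work out how. Can anyone help me?
The sample data I'm using is as follows: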
empid FROMDATE TODATE
===== ======== ======
1 01.01.2013 03.01.2013
1 02.01.2013 05.01.2013
2 01.01.2013 04.01.2013
2 02.01.2013 03.01.2013
2 02.01.2013 06.01.2013
3 01.01.2013 02.01.2013
3 04.01.2013 06.01.2013
3 01.01.2013 04.01.2013
4 01.01.2013 03.01.2013
4 04.01.2013 06.01.2013
5 01.01.2013 06.01.2013
5 01.01.2013 02.01.2013
5 02.01.2013 05.01.2013
5 03.01.2013 04.01.2013
6 01.01.2013 02.01.2013
6 02.01.2013 03.01.2013
6 05.01.2013 06.01.2013
6 05.01.2013 07.01.2013
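For empid 1-5, taking the min of the from-dates and the max of the to-dates gives the solution. I am puzzled by the case of empid 6, where the gap is more than one day.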
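For anyone who wants to reproduce this, the sample data could be loaded with a script along the following lines. This is only a setup sketch: the table name t42 is the one the answer's queries select from, while the column types (number, date) are assumptions, since the question does not show the table definition. The dates in the listing are read as DD.MM.YYYY.

```sql
-- Assumed table definition; only the column names come from the question
create table t42 (
  empid    number,
  fromdate date,
  todate   date
);

insert into t42 values (1, date '2013-01-01', date '2013-01-03');
insert into t42 values (1, date '2013-01-02', date '2013-01-05');
insert into t42 values (2, date '2013-01-01', date '2013-01-04');
insert into t42 values (2, date '2013-01-02', date '2013-01-03');
insert into t42 values (2, date '2013-01-02', date '2013-01-06');
insert into t42 values (3, date '2013-01-01', date '2013-01-02');
insert into t42 values (3, date '2013-01-04', date '2013-01-06');
insert into t42 values (3, date '2013-01-01', date '2013-01-04');
insert into t42 values (4, date '2013-01-01', date '2013-01-03');
insert into t42 values (4, date '2013-01-04', date '2013-01-06');
insert into t42 values (5, date '2013-01-01', date '2013-01-06');
insert into t42 values (5, date '2013-01-01', date '2013-01-02');
insert into t42 values (5, date '2013-01-02', date '2013-01-05');
insert into t42 values (5, date '2013-01-03', date '2013-01-04');
insert into t42 values (6, date '2013-01-01', date '2013-01-02');
insert into t42 values (6, date '2013-01-02', date '2013-01-03');
insert into t42 values (6, date '2013-01-05', date '2013-01-06');
insert into t42 values (6, date '2013-01-05', date '2013-01-07');

commit;
```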
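Answer 0 (score: 1)
The overlapping dates, particularly where some date ranges lie entirely within others, make this complicated, as Jeffrey Kemp pointed out. The simplest way to deal with that is probably to explode all the ranges into individual dates, and then combine them back into distinct ranges. If you have 11gR2, one way to explode them is with recursive subquery factoring (a recursive CTE):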
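with r (empid, onedate, todate) as (
  -- anchor member: each range starts at its from-date
  select empid, fromdate, todate
  from t42
  union all
  -- recursive member: step forward one day until the to-date is reached
  select empid, onedate + 1, todate
  from r
  where onedate < todate
)
...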
This generates every date for every employee, but because of the overlaps it contains duplicates, so you can eliminate those:
...,
s as (
  select distinct empid, onedate
  from r
)
...
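Then you're back to using lead and lag to look at contiguous ranges. This could be compressed a bit, but I've left it like this so it's easier (I hope) to follow the logic. First find the previous and next date for each employee: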
...,
t as (
  select empid, onedate,
    lag(onedate) over (partition by empid order by onedate) as lagdate,
    lead(onedate) over (partition by empid order by onedate) as leaddate
  from s
)
...
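Then mark the dates that start or end a range, which effectively discards the mid-range dates because they end up with both values null: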
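...,
u as (
  select empid, onedate, lagdate, leaddate,
    -- a range starts where the previous date is missing or more than a day back
    case when lagdate is null or lagdate < onedate - 1 then onedate end
      as fromdate,
    -- a range ends where the next date is missing or more than a day ahead
    case when leaddate is null or leaddate > onedate + 1 then onedate end
      as todate
  from t
)
...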
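To see what this step does, it may help to run t and u against the distinct dates of a single employee. The sketch below hard-codes the six dates that empid 6 covers in the sample data as a stand-in for s (that inline list and the output aliases d, range_start and range_end are made up for illustration), and formats the dates explicitly so the result does not depend on NLS settings.

```sql
-- Hypothetical stand-in for CTE s: the distinct dates for empid 6 only
with s as (
  select 6 as empid, date '2013-01-01' as onedate from dual union all
  select 6, date '2013-01-02' from dual union all
  select 6, date '2013-01-03' from dual union all
  select 6, date '2013-01-05' from dual union all
  select 6, date '2013-01-06' from dual union all
  select 6, date '2013-01-07' from dual
),
t as (
  select empid, onedate,
    lag(onedate) over (partition by empid order by onedate) as lagdate,
    lead(onedate) over (partition by empid order by onedate) as leaddate
  from s
),
u as (
  select empid, onedate, lagdate, leaddate,
    case when lagdate is null or lagdate < onedate - 1 then onedate end
      as fromdate,
    case when leaddate is null or leaddate > onedate + 1 then onedate end
      as todate
  from t
)
select to_char(u.onedate, 'YYYY-MM-DD') as d,
       to_char(u.fromdate, 'YYYY-MM-DD') as range_start,
       to_char(u.todate, 'YYYY-MM-DD') as range_end
from u
order by u.onedate;

-- D          RANGE_START RANGE_END
-- ---------- ----------- ----------
-- 2013-01-01 2013-01-01
-- 2013-01-02
-- 2013-01-03             2013-01-03
-- 2013-01-05 2013-01-05
-- 2013-01-06
-- 2013-01-07             2013-01-07
```

Only the rows that start or end a range carry a value; the mid-range rows (Jan 2 and Jan 6) have both columns null.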
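Finally use lead and lag again to collapse the rows you have left. You can do this because the 'from' and 'to' records are adjacent once all the mid-range values have been eliminated: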
select distinct empid,
  -- a 'to' row picks up its start from the preceding 'from' row
  case when fromdate is null then lag(fromdate)
    over (partition by empid order by onedate) else fromdate end as fromdate,
  -- a 'from' row picks up its end from the following 'to' row
  case when todate is null then lead(todate)
    over (partition by empid order by onedate) else todate end as todate
from u
where fromdate is not null
or todate is not null
order by empid, fromdate;
So putting that all together:
with r (empid, onedate, todate) as (
  select empid, fromdate, todate
  from t42
  union all
  select empid, onedate + 1, todate
  from r
  where onedate < todate
),
s as (
  select distinct empid, onedate
  from r
),
t as (
  select empid, onedate,
    lag(onedate) over (partition by empid order by onedate) as lagdate,
    lead(onedate) over (partition by empid order by onedate) as leaddate
  from s
),
u as (
  select empid, onedate, lagdate, leaddate,
    case when lagdate is null or lagdate < onedate - 1 then onedate end
      as fromdate,
    case when leaddate is null or leaddate > onedate + 1 then onedate end
      as todate
  from t
)
select distinct empid,
  case when fromdate is null then lag(fromdate)
    over (partition by empid order by onedate) else fromdate end as fromdate,
  case when todate is null then lead(todate)
    over (partition by empid order by onedate) else todate end as todate
from u
where fromdate is not null
or todate is not null
order by empid, fromdate;
... which gives:
EMPID FROMDATE TODATE
---------- ---------- ----------
1 2013-01-01 2013-01-05
2 2013-01-01 2013-01-06
3 2013-01-01 2013-01-06
4 2013-01-01 2013-01-06
5 2013-01-01 2013-01-06
6 2013-01-01 2013-01-03
6 2013-01-05 2013-01-07
7 rows selected
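This works in 11.2.0.3, but the recursive CTE seems to give the wrong answer on SQL Fiddle, which is 11.2.0.2, so I'm not sure whether you would see errors. And you can't use it in earlier versions anyway. Using connect by to expand ranges from multiple rows is tricky, and I was trying to avoid using a function, but you can do this: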
with r as (
  -- one row per day between the earliest from-date and the latest to-date
  select mindate + level - 1 as onedate
  from (
    select min(fromdate) as mindate, max(todate) as maxdate
    from t42
  )
  connect by level <= maxdate - mindate + 1
),
s as (
  -- keep only the days each employee actually covers
  select distinct t.empid, r.onedate
  from r
  join t42 t on r.onedate between t.fromdate and t.todate
)
...
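With the rest of the CTEs and the final query from above, that does work on SQL Fiddle, and produces the same output. It should also work at least as far back as 10g. This Fiddle shows it split into stages, so you can see how the data is manipulated at each point.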
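If exploding every range into individual days turns out to be too expensive (long ranges, many rows), the same result can probably be reached without generating dates at all. The following is a sketch of a different technique from the one described above: a running-maximum "gaps and islands" grouping. It assumes the same t42 table and that the dates carry no time component; the names flagged, grouped, new_group and grp are made up for illustration.

```sql
with flagged as (
  select empid, fromdate, todate,
         -- compare each range with the latest to-date seen so far for the
         -- employee; a start within one day of it continues the current span
         case
           when fromdate <= max(todate) over (
                  partition by empid
                  order by fromdate, todate
                  rows between unbounded preceding and 1 preceding
                ) + 1
           then 0
           else 1  -- first row (running max is null) or a gap of more than a day
         end as new_group
  from t42
),
grouped as (
  select empid, fromdate, todate,
         -- a running total of the flags numbers the spans per employee
         sum(new_group) over (
           partition by empid
           order by fromdate, todate
         ) as grp
  from flagged
)
select empid, min(fromdate) as fromdate, max(todate) as todate
from grouped
group by empid, grp
order by empid, fromdate;
```

Using the running maximum rather than just the previous row's to-date is what copes with ranges that lie entirely inside an earlier one, such as the rows for empid 5. Against the sample data this should return the same seven rows as the queries above.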