I have the following table:
╔══════╦═══════════╦═════════╗
║ Emp# ║ StartDate ║ EndDate ║
╠══════╬═══════════╬═════════╣
║ 1 ║ 1Jan ║ 15Jan ║
║ 1 ║ 3Jan ║ 5Jan ║
║ 1 ║ 10Jan ║ 20Jan ║
║ 1 ║ 23Jan ║ 25Jan ║
║ 1 ║ 24Jan ║ 27Jan ║
╚══════╩═══════════╩═════════╝
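I need to write a query that fully merges the overlapping ranges, so that each employee has at most one row covering any given calendar date. The output should look like this: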
╔══════╦═══════════╦═════════╗
║ Emp# ║ StartDate ║ EndDate ║
╠══════╬═══════════╬═════════╣
║ 1 ║ 1Jan ║ 20Jan ║
║ 1 ║ 23Jan ║ 27Jan ║
╚══════╩═══════════╩═════════╝
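I tried doing this with self-joins, but I would need X self-joins to handle X overlaps. I would appreciate any pointers toward a solution. Thanks a lot in advance!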
Answer 0 (score: 3)
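Here is one idea: use an EXISTS subquery inside a CASE expression to flag each row that starts a new period, that is, a row that no earlier-starting row of the same employee overlaps or touches. A running sum of that flag then numbers the periods, and a final aggregation collapses each one into a single row. This approach works well, but it needs a slight tweak when two periods share the same start date at the beginning of an overlapping period. So: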
select emp#, min(startdate) as startdate, max(enddate) as enddate
from (select t.*,
             -- running sum of the flag numbers each merged period
             sum(OverlapFlag) over (partition by Emp# order by startdate) as grp
      from (select t.*,
                   -- 1 = this row starts a new period, 0 = it continues an earlier one
                   (case when exists (select 1
                                      from t t2
                                      where t2.Emp# = t.Emp# and
                                            t2.startdate < t.startdate and
                                            t2.enddate + 1 >= t.startdate
                                     )
                         then 0 else 1
                    end) as OverlapFlag
            from t
           ) t
     ) t
group by emp#, grp;
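To check the flag-and-running-sum logic outside the database, here is a minimal Python sketch of the same idea. It is not Oracle code: the function name merge_by_flags, the tuple layout (emp, start, end) and the year 2016 are assumptions made for illustration. Rows that share a start date get the same group number, which is what the default RANGE window frame does in the query above.

```python
from datetime import date, timedelta

ONE_DAY = timedelta(days=1)


def merge_by_flags(rows):
    """Merge inclusive (emp, start, end) date ranges per employee."""
    rows = list(rows)
    # Flag 1 = no earlier-starting row of the same employee overlaps or
    # touches this row's start (mirrors the EXISTS subquery).
    flags = [
        0 if any(e2 == emp and s2 < start and end2 + ONE_DAY >= start
                 for e2, s2, end2 in rows) else 1
        for emp, start, _ in rows
    ]
    merged = {}
    for emp, start, end in rows:
        # Running sum of the flags up to and including this start date;
        # rows sharing a start date therefore land in the same group.
        grp = sum(f for (e2, s2, _), f in zip(rows, flags)
                  if e2 == emp and s2 <= start)
        lo, hi = merged.get((emp, grp), (start, end))
        merged[(emp, grp)] = (min(lo, start), max(hi, end))
    return sorted((emp, lo, hi) for (emp, _), (lo, hi) in merged.items())


# The question's rows; the year 2016 is an assumption (the question gives none).
rows = [
    (1, date(2016, 1, 1), date(2016, 1, 15)),
    (1, date(2016, 1, 3), date(2016, 1, 5)),
    (1, date(2016, 1, 10), date(2016, 1, 20)),
    (1, date(2016, 1, 23), date(2016, 1, 25)),
    (1, date(2016, 1, 24), date(2016, 1, 27)),
]
merged_rows = merge_by_flags(rows)
```

The sketch compares every row with every other row, so it is only meant for small test sets; in the database the optimizer decides how the correlated subquery is evaluated.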
Answer 1 (score: 0)
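I would use PL/SQL here. First sort the records by StartDate, then for each subsequent record check whether its StartDate still falls within the current range. If it does, check whether its EndDate extends the range.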
Here is the package specification:
create or replace package mypackage as
type type_mytable is table of mytable%rowtype;
function get_ranges return type_mytable pipelined;
end mypackage;
The package body:
create or replace package body mypackage as
  function get_ranges return type_mytable pipelined as
    v_current mytable%rowtype;  -- range currently being built
  begin
    for rec in
    (
      select *
      from mytable
      order by emp#, startdate
    ) loop
      -- same employee, and the start overlaps or touches the current range
      if rec.emp# = v_current.emp# and rec.startdate between v_current.startdate
                                                       and v_current.enddate + 1 then
        if rec.enddate > v_current.enddate then
          v_current.enddate := rec.enddate;
        end if;
      else
        if v_current.emp# is not null then
          pipe row(v_current);
        end if;
        v_current := rec;
      end if;
    end loop;
    -- emit the last range, unless the table was empty
    if v_current.emp# is not null then
      pipe row(v_current);
    end if;
    return;
  end get_ranges;
end mypackage;
Calling the function:
select * from table(mypackage.get_ranges) where emp# = 1;
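The single pass made by the loop in get_ranges can be sketched outside the database as a Python generator. This is an illustration of the same sort-then-extend logic, not a translation of the package: the tuple layout (emp, start, end) and the sample rows (taken from the thread's test data, year 2016) are assumptions.

```python
from datetime import date, timedelta

ONE_DAY = timedelta(days=1)


def get_ranges(rows):
    """Yield merged (emp, start, end) ranges from inclusive date ranges."""
    current = None  # range being built, like v_current in the package
    for emp, start, end in sorted(rows):  # order by emp, start
        # Rows are sorted, so start >= current start always holds; only the
        # upper bound of the BETWEEN test needs checking here.
        if (current is not None and emp == current[0]
                and start <= current[2] + ONE_DAY):
            if end > current[2]:
                current = (current[0], current[1], end)  # extend the range
        else:
            if current is not None:
                yield current  # counterpart of PIPE ROW
            current = (emp, start, end)
    if current is not None:
        yield current  # last range; nothing is emitted for empty input


# Sample rows; the year 2016 is an assumption for the employee 1 rows.
sample = [
    (1, date(2016, 1, 1), date(2016, 1, 15)),
    (1, date(2016, 1, 3), date(2016, 1, 5)),
    (1, date(2016, 1, 10), date(2016, 1, 20)),
    (1, date(2016, 1, 23), date(2016, 1, 25)),
    (1, date(2016, 1, 24), date(2016, 1, 27)),
    (2, date(2016, 3, 15), date(2016, 3, 18)),
    (2, date(2016, 3, 19), date(2016, 3, 19)),
]
ranges = list(get_ranges(sample))
```

Because each row is visited once after sorting, the work is dominated by the sort, which is the main appeal of the procedural approach.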
Answer 2 (score: 0)
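Here is an older solution (from one of the comments) that works for pure dates. You may want to compare the different solutions offered here to see which performs best on your real data; different solutions may be best in different situations.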
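Note: I used your input data and created more for testing. I assume your data is valid (all dates are valid, their time component is 00:00:00, and enddate is always greater than or equal to startdate). The solution does not include the inputs factored subquery, which is shown below only for testing. I did not order the result by emp# and startdate (the output may be misleading in that respect); if you do need such ordering, you must add it explicitly. Note the use of date literals in the test data. The output shows dates in the format of my current session settings; if you need a specific format, use to_char() with the desired display format model.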
QUERY:
with
  -- test data only; not part of the solution
  inputs ( emp#, startdate, enddate ) as (
    select 1, date '2016-01-01', date '2016-01-15' from dual union all
    select 1, date '2016-01-03', date '2016-01-05' from dual union all
    select 1, date '2016-01-10', date '2016-01-20' from dual union all
    select 1, date '2016-01-23', date '2016-01-25' from dual union all
    select 1, date '2016-01-24', date '2016-01-27' from dual union all
    select 2, date '2016-01-31', date '2016-02-28' from dual union all
    select 2, date '2016-03-15', date '2016-03-18' from dual union all
    select 2, date '2016-03-19', date '2016-03-19' from dual union all
    select 2, date '2016-03-20', date '2016-03-20' from dual
  ),
  -- mdate = 1 + latest enddate among all earlier rows, plus one closing row per employee
  m ( emp#, startdate, mdate ) as (
    select emp#, startdate,
           1 + max(enddate) over (partition by emp# order by startdate
                                  rows between unbounded preceding and 1 preceding)
    from inputs
    union all
    select emp#, NULL, 1 + max(enddate)
    from inputs
    group by emp#
  ),
  -- keep only the rows that open a new range, and the closing row
  n ( emp#, startdate, mdate ) as (
    select emp#, startdate, mdate
    from m
    where startdate > mdate or startdate is null or mdate is null
  ),
  -- each range ends one day before the next kept row's mdate
  f ( emp#, startdate, enddate ) as (
    select emp#, startdate,
           lead(mdate) over (partition by emp# order by startdate) - 1
    from n
  )
select * from f where startdate is not null
OUTPUT (for the data in the inputs CTE):
EMP# STARTDATE ENDDATE
------ ---------- ----------
1 01/01/2016 20/01/2016
1 23/01/2016 27/01/2016
2 31/01/2016 28/02/2016
2 15/03/2016 20/03/2016
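For readers who want to trace the m, n and f steps by hand, the following Python sketch mirrors them one by one. It is an illustration under assumptions, not the answer's code: the function name merge_by_running_max and the tuple layout (emp, start, end) are made up here, and Python's None stands in for SQL NULL.

```python
from datetime import date, timedelta
from itertools import groupby
from operator import itemgetter

ONE_DAY = timedelta(days=1)


def merge_by_running_max(rows):
    """Merge inclusive (emp, start, end) ranges, following the m/n/f CTEs."""
    result = []
    for emp, group in groupby(sorted(rows), key=itemgetter(0)):
        # "m": pair each start with 1 + max(end) over all earlier rows
        # (None for the first row), then add a closing marker whose start
        # is None and whose mdate is 1 + the overall max(end).
        m, running = [], None
        for _, start, end in group:
            m.append((start, None if running is None else running + ONE_DAY))
            running = end if running is None else max(running, end)
        m.append((None, running + ONE_DAY))
        # "n": keep rows that open a new range, plus the closing marker.
        n = [(s, md) for s, md in m if s is None or md is None or s > md]
        # "f": a range ends one day before the next kept row's mdate (LEAD).
        for (start, _), (_, next_mdate) in zip(n, n[1:]):
            result.append((emp, start, next_mdate - ONE_DAY))
    return result


# Same test data as the inputs CTE.
inputs = [
    (1, date(2016, 1, 1), date(2016, 1, 15)),
    (1, date(2016, 1, 3), date(2016, 1, 5)),
    (1, date(2016, 1, 10), date(2016, 1, 20)),
    (1, date(2016, 1, 23), date(2016, 1, 25)),
    (1, date(2016, 1, 24), date(2016, 1, 27)),
    (2, date(2016, 1, 31), date(2016, 2, 28)),
    (2, date(2016, 3, 15), date(2016, 3, 18)),
    (2, date(2016, 3, 19), date(2016, 3, 19)),
    (2, date(2016, 3, 20), date(2016, 3, 20)),
]
merged = merge_by_running_max(inputs)
```

Unlike the query, the sketch returns its rows sorted by employee and start date, simply because it processes the input in sorted order.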