Question

I have the following table:

 ╔══════╦═══════════╦═════════╗
 ║ Emp# ║ StartDate ║ EndDate ║
 ╠══════╬═══════════╬═════════╣
 ║    1 ║ 1Jan      ║ 15Jan   ║
 ║    1 ║ 3Jan      ║ 5Jan    ║
 ║    1 ║ 10Jan     ║ 20Jan   ║
 ║    1 ║ 23Jan     ║ 25Jan   ║
 ║    1 ║ 24Jan     ║ 27Jan   ║
 ╚══════╩═══════════╩═════════╝

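I need to create a query that fully merges the overlaps, so that each employee has at most one row for any given calendar date. The output should look like this: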

 ╔══════╦═══════════╦═════════╗
 ║ Emp# ║ StartDate ║ EndDate ║
 ╠══════╬═══════════╬═════════╣
 ║    1 ║ 1Jan      ║ 20Jan   ║
 ║    1 ║ 23Jan     ║ 27Jan   ║
 ╚══════╩═══════════╩═════════╝

I tried doing this with self-joins, but I would need X self-joins for X overlaps. I would appreciate any pointers toward a solution. Thanks in advance!

Answer 1

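Here is an idea:

Determine where each group starts. For this, use exists together with case.
Assign a flag to those starting rows.
Accumulate the flag so that all overlapping periods share the same value.
Use that value for aggregation.

This approach works well, but it needs a slight tweak for the case where two periods with the same start date begin an overlapping span. So: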

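-- Outer query: collapse each group into a single row.
select emp#, min(startdate) as startdate, max(enddate) as enddate
from (select t.*,
             -- Running sum of the flags: rows of one merged period share a grp value.
             sum(OverlapFlag) over (partition by Emp# order by startdate) as grp
      from (select t.*,
                   -- 0 if an earlier-starting period overlaps or abuts this one, else 1.
                   (case when exists (select 1
                                      from t t2
                                      where t2.Emp# = t.Emp# and
                                            t2.startdate < t.startdate and
                                            t2.enddate + 1 >= t.startdate
                                     )
                         then 0 else 1
                    end) as OverlapFlag
            from t
           ) t
     ) t
group by emp#, grp;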
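
To sanity-check the flag-and-running-sum logic outside the database, here is a minimal Python sketch of the same steps. It is not part of the original answer: the function name merge_by_flags, the tuple layout, and the year 2016 in the sample rows are assumptions made for illustration. The running sum includes rows with the same start date, which is how the default window frame treats ties in the query above.

```python
from datetime import date, timedelta

ONE_DAY = timedelta(days=1)

# Sample rows from the question as (emp, startdate, enddate); the year is assumed.
rows = [
    (1, date(2016, 1, 1), date(2016, 1, 15)),
    (1, date(2016, 1, 3), date(2016, 1, 5)),
    (1, date(2016, 1, 10), date(2016, 1, 20)),
    (1, date(2016, 1, 23), date(2016, 1, 25)),
    (1, date(2016, 1, 24), date(2016, 1, 27)),
]


def merge_by_flags(rows):
    """Merge overlapping or adjacent periods per employee (hypothetical helper)."""
    rows = sorted(rows)

    # Flag = 0 when an earlier-starting period of the same employee
    # overlaps or abuts this one, otherwise 1 (the row starts a group).
    flags = []
    for emp, start, _ in rows:
        covered = any(
            e2 == emp and s2 < start and end2 + ONE_DAY >= start
            for e2, s2, end2 in rows
        )
        flags.append(0 if covered else 1)

    # Running sum of flags per employee, ordered by start date.
    # Rows with the same start date are summed together (ties share a group).
    merged = {}
    for emp, start, end in rows:
        grp = sum(
            f for (e2, s2, _), f in zip(rows, flags) if e2 == emp and s2 <= start
        )
        lo, hi = merged.get((emp, grp), (start, end))
        merged[(emp, grp)] = (min(lo, start), max(hi, end))

    return [(emp, lo, hi) for (emp, _), (lo, hi) in sorted(merged.items())]
```

Calling merge_by_flags(rows) on the sample data should give the two rows the question asks for: 1 Jan to 20 Jan, and 23 Jan to 27 Jan.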

Answer 2

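I would use PL/SQL here. First sort the records by StartDate, then for each subsequent record check whether its StartDate still falls within the current range. If it does, check whether its EndDate extends the range.

Here is the package header: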

create or replace package mypackage as
  type type_mytable is table of mytable%rowtype;
  function get_ranges return type_mytable pipelined;
end mypackage;

The package body:

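create or replace package body mypackage as
  function get_ranges return type_mytable pipelined as
    v_current mytable%rowtype;  -- the range currently being built
  begin
    for rec in
    (
       select *
       from mytable
       order by emp#, startdate
    ) loop
      -- Same employee, and the row starts inside or directly after the current range?
      if rec.emp# = v_current.emp# and rec.startdate between v_current.startdate
                                                         and v_current.enddate + 1 then
        if rec.enddate > v_current.enddate then
          v_current.enddate := rec.enddate;  -- extend the current range
        end if;
      else
        if v_current.emp# is not null then
          pipe row(v_current);  -- emit the finished range
        end if;
        v_current := rec;  -- start a new range
      end if;
    end loop;
    -- Emit the last range, unless the table was empty.
    if v_current.emp# is not null then
      pipe row(v_current);
    end if;
    return;
  end get_ranges;
end mypackage;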

Call the function:

select * from table(mypackage.get_ranges) where emp# = 1;
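
For readers who want to trace the loop without an Oracle instance, the same single-pass sweep can be sketched in Python. This is only an illustration under assumptions: the generator name get_ranges is borrowed from the package above, rows are plain (emp, startdate, enddate) tuples, and the sample year 2016 is made up.

```python
from datetime import date, timedelta

ONE_DAY = timedelta(days=1)

# Sample rows from the question as (emp, startdate, enddate); the year is assumed.
rows = [
    (1, date(2016, 1, 1), date(2016, 1, 15)),
    (1, date(2016, 1, 3), date(2016, 1, 5)),
    (1, date(2016, 1, 10), date(2016, 1, 20)),
    (1, date(2016, 1, 23), date(2016, 1, 25)),
    (1, date(2016, 1, 24), date(2016, 1, 27)),
]


def get_ranges(rows):
    """Yield merged (emp, start, end) tuples in one pass over sorted rows."""
    current = None  # the range currently being built
    for emp, start, end in sorted(rows):
        # Rows are sorted by start date, so start >= current start always holds;
        # only the upper bound (current end + 1 day) needs checking.
        if current is not None and emp == current[0] and start <= current[2] + ONE_DAY:
            if end > current[2]:
                current = (current[0], current[1], end)  # extend the range
        else:
            if current is not None:
                yield current  # emit the finished range
            current = (emp, start, end)  # start a new range
    if current is not None:
        yield current  # emit the last range, if any
```

Like the pipelined function, the generator hands back each merged range as soon as it is complete, so list(get_ranges(rows)) materializes the result only when asked.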

Answer 3

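This is an older solution (from one of the comments) that works for pure dates. You may want to compare the different solutions offered here to see which works best on your actual data; different solutions may suit different situations.

Note: I used your input data and created some more for testing. I assume your data is valid (all dates are valid, their time component is 00:00:00, and enddate is always greater than or equal to startdate). The solution does not include the inputs factored subquery, which is shown below only for testing. I did not order the results by emp# and startdate (the output may be misleading in that respect); if you do need such ordering, add it explicitly. Note the use of date literals in the test data. The output shows dates in my current session's format; if you need a specific format, use to_char() with the desired display format model.

QUERY: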

with
     inputs ( emp#, startdate, enddate ) as (
       select 1, date '2016-01-01', date '2016-01-15' from dual union all
       select 1, date '2016-01-03', date '2016-01-05' from dual union all
       select 1, date '2016-01-10', date '2016-01-20' from dual union all
       select 1, date '2016-01-23', date '2016-01-25' from dual union all
       select 1, date '2016-01-24', date '2016-01-27' from dual union all
       select 2, date '2016-01-31', date '2016-02-28' from dual union all
       select 2, date '2016-03-15', date '2016-03-18' from dual union all
       select 2, date '2016-03-19', date '2016-03-19' from dual union all
       select 2, date '2016-03-20', date '2016-03-20' from dual
     ),
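     -- m: for each row, mdate is one day after the latest enddate among all
     -- earlier rows of the same employee (NULL for the first row); the extra
     -- row per employee carries one day after the overall latest enddate.
     m ( emp#, startdate, mdate ) as (
         select     emp#, startdate,
                    1 + max(enddate) over (partition by emp# order by startdate
                             rows between unbounded preceding and 1 preceding)
         from       inputs
         union all
         select     emp#, NULL, 1 + max(enddate)
           from     inputs
           group by emp#
     ),
     -- n: keep only rows that start a new merged range, plus the extra row.
     n ( emp#, startdate, mdate ) as (
         select emp#, startdate, mdate
         from   m
         where  startdate > mdate or startdate is null or mdate is null
     ),
     -- f: a range ends one day before the next kept row's mdate.
     f ( emp#, startdate, enddate ) as (
         select emp#, startdate,
                lead(mdate) over (partition by emp# order by startdate) - 1
         from   n
     )
select * from f where startdate is not null;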

OUTPUT (for the data in the inputs CTE):

  EMP# STARTDATE  ENDDATE          
------ ---------- ----------
     1 01/01/2016 20/01/2016
     1 23/01/2016 27/01/2016
     2 31/01/2016 28/02/2016
     2 15/03/2016 20/03/2016
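
The idea behind the query (track the latest end date seen so far, and start a new range whenever a start date lies more than one day past it) can be checked with a small Python sketch. This is an illustration rather than a translation of the SQL: the function name merge_by_running_max and the tuple layout are assumptions, and the test rows are copied from the inputs CTE.

```python
from datetime import date, timedelta
from itertools import groupby

ONE_DAY = timedelta(days=1)

# Test rows copied from the inputs CTE as (emp, startdate, enddate).
inputs = [
    (1, date(2016, 1, 1), date(2016, 1, 15)),
    (1, date(2016, 1, 3), date(2016, 1, 5)),
    (1, date(2016, 1, 10), date(2016, 1, 20)),
    (1, date(2016, 1, 23), date(2016, 1, 25)),
    (1, date(2016, 1, 24), date(2016, 1, 27)),
    (2, date(2016, 1, 31), date(2016, 2, 28)),
    (2, date(2016, 3, 15), date(2016, 3, 18)),
    (2, date(2016, 3, 19), date(2016, 3, 19)),
    (2, date(2016, 3, 20), date(2016, 3, 20)),
]


def merge_by_running_max(rows):
    """Merge periods per employee using a running maximum of end dates."""
    merged = []
    for emp, group in groupby(sorted(rows), key=lambda r: r[0]):
        range_start = None
        prev_max_end = None  # latest enddate among the rows seen so far
        for _, start, end in group:
            if prev_max_end is None:
                range_start = start  # first row of this employee
            elif start > prev_max_end + ONE_DAY:
                # Gap of at least one day: close the range and open a new one.
                merged.append((emp, range_start, prev_max_end))
                range_start = start
            prev_max_end = end if prev_max_end is None else max(prev_max_end, end)
        merged.append((emp, range_start, prev_max_end))  # close the last range
    return merged
```

On the test rows this should reproduce the four ranges in the OUTPUT table, including the three adjacent March periods for employee 2 collapsing into a single range.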

Oracle SQL - Merging overlapping time ranges
