Question

I need help with overlapping intervals. I have these records (and many more) in a table:

Example 1:

Id        StartDate     EndDate
794122    2011-05-10    2999-12-31
794122    2011-04-15    2999-12-31
794122    2008-04-03    2999-12-31
794122    2008-03-31    2999-12-31
794122    2008-02-29    2999-12-31
794122    2008-02-04    2999-12-31
794122    2007-10-10    2999-12-31
794122    2007-09-15    2999-12-31

Example 2:

Id        StartDate     EndDate
5448      2012-12-28    2999-12-31
5448      2011-06-30    2999-12-31
5448      2005-12-26    2011-06-30
5448      2005-06-15    2011-06-30
5448      2006-07-31    2006-12-31
5448      2001-03-31    2006-07-15

Example 3:

Id        StartDate     EndDate
214577    2007-02-28    2999-12-31
214577    2003-06-20    2007-03-04
214577    2003-06-20    2007-02-28

Example 4:

Id        StartDate     EndDate
9999      2008-05-28    2999-01-01
9999      2005-03-03    2008-05-31
9999      2005-05-31    2005-12-31
9999      2003-12-01    2005-08-12
9999      2001-01-01    2002-03-05
9999      2000-01-08    2002-01-01

I would like to get:

*Example1* - 2007-09-15->3000-01-01

*Example2* - 2001-03-31->3000-01-01

*Example3* - 2003-06-20->3000-01-01

*Example4* - 2003-12-01->3000-01-01

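Do you have any suggestions on how to do this? I cannot simply select the minimum and maximum grouped by Id: Example 4 shows the problem, because there is a gap between its periods.

Thanks!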

Answer 1

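The expected result for Example 4 doesn't match your data: shouldn't the end date for 9999 be 2999-01-02 rather than 3000-01-01?

The typical solution for combining overlapping periods uses nested OLAP functions. For your specific requirement (only the latest period) it can be simplified to: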

SELECT *
FROM
 (
   SELECT DISTINCT -- DISTINCT is not necessary, but results in a better plan
      Id,
      StartDate,
      MAX(EndDate) 
      OVER (PARTITION BY Id) + 1 AS EndDate
   FROM dropme AS t
   QUALIFY -- find the gap
      COALESCE(StartDate 
               - MAX(EndDate) 
                 OVER (PARTITION BY Id
                       ORDER BY StartDate, EndDate
                       ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING), 1) > 0
 ) AS dt
QUALIFY 
   ROW_NUMBER() 
   OVER (PARTITION BY Id
         ORDER BY StartDate DESC) = 1
;
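
To make the gap test in the inner QUALIFY easier to follow, here is a minimal Python sketch of the same idea. It is not the Teradata query itself: the function name `latest_period` and the in-memory list of tuples are assumptions made for illustration. A row opens a new period when its start date is later than every end date seen before it, and the last such row gives the start of the latest period.

```python
from datetime import date, timedelta


def latest_period(rows):
    """Return (start, end) of the most recent merged period for one Id.

    rows: iterable of (start_date, end_date) tuples (hypothetical input shape).
    """
    ordered = sorted(rows)  # same order as ORDER BY StartDate, EndDate
    if not ordered:
        raise ValueError("at least one row is required")

    latest_start = None
    running_max_end = None  # MAX(EndDate) over all preceding rows
    for start, end in ordered:
        # The first row, or a row that starts after every earlier end date,
        # opens a new period (the "gap" condition of the inner QUALIFY).
        if running_max_end is None or start > running_max_end:
            latest_start = start
        if running_max_end is None or end > running_max_end:
            running_max_end = end

    # As in the query, the reported end is the overall maximum plus one day.
    return latest_start, running_max_end + timedelta(days=1)


# Rows of Example 4 (Id 9999) from the question.
example4 = [
    (date(2008, 5, 28), date(2999, 1, 1)),
    (date(2005, 3, 3), date(2008, 5, 31)),
    (date(2005, 5, 31), date(2005, 12, 31)),
    (date(2003, 12, 1), date(2005, 8, 12)),
    (date(2001, 1, 1), date(2002, 3, 5)),
    (date(2000, 1, 8), date(2002, 1, 1)),
]

start, end = latest_period(example4)
print(start.isoformat(), end.isoformat())  # 2003-12-01 2999-01-02
```

For Example 4 the sketch returns 2999-01-02 as the end date, which is the mismatch with the question's expected 3000-01-01 noted at the top of this answer.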

Answer 2

Are you just trying to do this?

select id, min(start_date) as start_date, max(end_date) as end_date
from t
group by id;

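Edit:

Now I understand what you need. The query below identifies the rows that start a new period (using a not exists clause to look for overlaps). Then, for each id, it picks the maximum start_date among those rows: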

select t.id, min(t.start_date) as start_date, max(t.end_date) as end_date
from (select id, max(start_date) as maxsd
      from t
      where not exists (select 1
                        from t t2
                        where t2.id = t.id and  -- compare only rows of the same id
                              t2.start_date < t.start_date and
                              t2.end_date >= t.start_date
                       )
      group by id
     ) ids join
     t
     on t.id = ids.id and
        t.start_date >= maxsd
group by t.id;

The final step joins back to the original data and aggregates every row that starts on or after that start date.
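
As a rough way to check this logic without a Teradata system, the sketch below runs the same query shape against an in-memory SQLite database from Python. Several things here are assumptions rather than part of the answer: SQLite stands in for Teradata, dates are stored as ISO text (which compares correctly as strings), the helper name `run_query` is made up, an `order by` is added only to make the output deterministic, and the correlation `t2.id = t.id` is included so that rows of different ids are never compared with each other.

```python
import sqlite3

QUERY = """
select t.id, min(t.start_date) as start_date, max(t.end_date) as end_date
from (select id, max(start_date) as maxsd
      from t
      where not exists (select 1
                        from t t2
                        where t2.id = t.id and
                              t2.start_date < t.start_date and
                              t2.end_date >= t.start_date
                       )
      group by id
     ) ids join
     t
     on t.id = ids.id and
        t.start_date >= maxsd
group by t.id
order by t.id  -- added only for a stable row order in this sketch
"""

# Examples 4 and 3 from the question, loaded into one table.
ROWS = [
    (9999, "2008-05-28", "2999-01-01"),
    (9999, "2005-03-03", "2008-05-31"),
    (9999, "2005-05-31", "2005-12-31"),
    (9999, "2003-12-01", "2005-08-12"),
    (9999, "2001-01-01", "2002-03-05"),
    (9999, "2000-01-08", "2002-01-01"),
    (214577, "2007-02-28", "2999-12-31"),
    (214577, "2003-06-20", "2007-03-04"),
    (214577, "2003-06-20", "2007-02-28"),
]


def run_query(rows):
    """Load rows into an in-memory table t and return the query result."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table t (id integer, start_date text, end_date text)")
    conn.executemany("insert into t values (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


for row in run_query(ROWS):
    print(row)
# (9999, '2003-12-01', '2999-01-01')
# (214577, '2003-06-20', '2999-12-31')
```

With both ids in one table, the `t2.id = t.id` condition matters: without it, the 2003-12-01 row of id 9999 would be treated as overlapped by the 2003-06-20 to 2007-03-04 row of id 214577 and would no longer count as the start of a new period. Note also that this query returns the maximum end date as stored, without adding a day.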

Answer 3

Do you want the end date to be the first day of the following year?

select id, min(startdate) start_date, 
       cast(max(extract(year from enddate)) + 1 || '-01-01' as date) end_date
from table1
group by id

Overlapping time intervals - Teradata
