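Oracle SQL: Detecting breaks in continuous spans

Time: 2011-01-25 17:09:44

Tags: sql oracle

I have the following table, and I am trying to detect products whose spans have a break.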

Product     | unit_Cost | price start date |    price end date
--------------------------------------------------------------------------
product 1     15.00         01/01/2011      03/31/2011
product 1     15.00         04/01/2011      06/31/2011
product 1     15.00         07/01/2011      09/31/2011
product 1     15.00         10/01/2011      12/31/2011

product 2     10.00         01/01/2011      12/31/2011

product 3     25.00         01/01/2011      06/31/2011
product 3     25.00         10/01/2011      12/31/2011

So here I would want it to report product 3, because we are missing the range

07/01/2011 - 09/31/2011

Any ideas on how I can do this?

EDIT: Oracle version: 10g

Create Table Statement

CREATE TABLE Sandbox.TBL_PRODUCT
(
  PRODUCT_ID        VARCHAR2(13 BYTE),   
  PRODUCT           VARCHAR2(64 BYTE),
  UNIT_COST         NUMBER,
  PRICE_START_DATE  DATE,
  PRICE_END_DATE    DATE
)

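EDIT 2: Start and end dates cannot overlap.

EDIT 3: A span can be any two dates as long as price_end_date >= price_start_date. Equality is included because a product can be on sale for a single day.

5 answers:

Answer 0 (score: 2)

Try this (using the LEAD analytic function):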

SELECT *
  FROM (
         SELECT a.*,
                LEAD(price_start_date, 1, NULL) OVER (PARTITION BY product ORDER BY price_end_date) next_start_date
           FROM Product a
       )
 WHERE (price_end_date + 1) <> next_start_date

Example setup

        CREATE TABLE PRODUCT
          (
            PRODUCT   VARCHAR2(100 BYTE),
            UNIT_COST NUMBER,
            START_DATE DATE,
            END_DATE DATE
          );

        INSERT INTO Product VALUES('product 1',15.00,TO_DATE('01/01/2011','MM/DD/RRRR'),TO_DATE('03/31/2011','MM/DD/RRRR'));
        INSERT INTO Product VALUES('product 1',15.00,TO_DATE('04/01/2011','MM/DD/RRRR'),TO_DATE('06/30/2011','MM/DD/RRRR'));
        INSERT INTO Product VALUES('product 1',15.00,TO_DATE('07/01/2011','MM/DD/RRRR'),TO_DATE('09/30/2011','MM/DD/RRRR'));
        INSERT INTO Product VALUES('product 1',15.00,TO_DATE('10/01/2011','MM/DD/RRRR'),TO_DATE('12/31/2011','MM/DD/RRRR'));
        INSERT INTO Product VALUES('product 2',10.00,TO_DATE('01/01/2011','MM/DD/RRRR'),TO_DATE('12/31/2011','MM/DD/RRRR'));
        INSERT INTO Product VALUES('product 3',25.00,TO_DATE('01/01/2011','MM/DD/RRRR'),TO_DATE('06/30/2011','MM/DD/RRRR'));
        INSERT INTO Product VALUES('product 3',25.00,TO_DATE('10/01/2011','MM/DD/RRRR'),TO_DATE('12/31/2011','MM/DD/RRRR'));

SELECT *
  FROM (
         SELECT a.*,
                LEAD(start_date, 1, NULL) OVER (PARTITION BY product ORDER BY start_date) next_start_date
           FROM Product a
       )
 WHERE (end_date + 1) <> next_start_date

EDIT: Updated the query to compare the next start_date with the current end_date, to avoid problems with how the data is distributed.
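To check the logic outside the database, here is a minimal Python sketch of what the LEAD query does, run against the sample rows from the setup above. The function name `rows_before_gap` and the tuple layout are assumptions made for illustration; they are not part of the original answer.

```python
from datetime import date, timedelta
from itertools import groupby

# Sample rows from the setup above: (product, start_date, end_date).
ROWS = [
    ("product 1", date(2011, 1, 1), date(2011, 3, 31)),
    ("product 1", date(2011, 4, 1), date(2011, 6, 30)),
    ("product 1", date(2011, 7, 1), date(2011, 9, 30)),
    ("product 1", date(2011, 10, 1), date(2011, 12, 31)),
    ("product 2", date(2011, 1, 1), date(2011, 12, 31)),
    ("product 3", date(2011, 1, 1), date(2011, 6, 30)),
    ("product 3", date(2011, 10, 1), date(2011, 12, 31)),
]


def rows_before_gap(rows):
    """Return rows whose end_date + 1 day differs from the next start_date
    of the same product (the role LEAD plays in the SQL)."""
    flagged = []
    # Equivalent of PARTITION BY product ORDER BY start_date.
    ordered = sorted(rows, key=lambda r: (r[0], r[1]))
    for _, group in groupby(ordered, key=lambda r: r[0]):
        spans = list(group)
        # zip pairs each row with its successor. The last row of a product
        # has no successor, just as LEAD returns NULL there and the SQL
        # comparison filters that row out.
        for current, nxt in zip(spans, spans[1:]):
            if current[2] + timedelta(days=1) != nxt[1]:
                flagged.append(current)
    return flagged


print(rows_before_gap(ROWS))
```

With the sample data, the only flagged row is the first span of product 3, which ends on 06/30/2011 while the next span starts on 10/01/2011.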

Answer 1 (score: 1)

You can also use this technique. It uses an inner query (chronological_record) to assign a rank to each record in the TBL_PRODUCT table, ordered by start_date within each product.

WITH
  chronological_record AS
  (
    SELECT
      product,
      unit_cost,
      start_date,
      end_date,
      (DENSE_RANK() OVER (PARTITION BY product ORDER BY start_date))
          AS chronological_order
    FROM
      TBL_PRODUCT
  )

SELECT
  earlier.product,
  (earlier.end_date + 1) AS missing_period_start_date,
  (later.start_date - 1) as missing_period_end_date
FROM
  CHRONOLOGICAL_RECORD earlier
  INNER JOIN
  CHRONOLOGICAL_RECORD later
    ON
        earlier.product = later.product
      AND
        (earlier.chronological_order + 1) = later.chronological_order
WHERE
  (earlier.end_date + 1) <> later.start_date

In your example, the subquery (chronological_record) would produce something like this:

Product   | unit_Cost | start date | end date   | chronological_order
--------------------------------------------------------------------------
product 1    15.00      01/01/2011   03/31/2011    1
product 1    15.00      04/01/2011   06/30/2011    2
product 1    15.00      07/01/2011   09/30/2011    3
product 1    15.00      10/01/2011   12/31/2011    4

product 2    10.00      01/01/2011   12/31/2011    1

product 3    25.00      01/01/2011   06/30/2011    1
product 3    25.00      10/01/2011   12/31/2011    2

The main query's INNER JOIN effectively matches each earlier record with the next record in chronological order.
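The rank-and-join idea can be sketched in Python as follows. This is an illustration under stated assumptions, not the answerer's code: the function name `missing_periods` and the `(product, start_date, end_date)` tuple layout are hypothetical, and a plain counter stands in for DENSE_RANK, which is equivalent only because spans of one product do not overlap and so have distinct start dates.

```python
from datetime import date, timedelta

ONE_DAY = timedelta(days=1)


def missing_periods(rows):
    """rows: iterable of (product, start_date, end_date).
    Returns (product, missing_start, missing_end) for every gap."""
    # Step 1: number each product's records in start_date order,
    # like the chronological_record subquery.
    ranked = {}
    counters = {}
    for product, start, end in sorted(rows):
        order = counters.get(product, 0) + 1
        counters[product] = order
        ranked[(product, order)] = (start, end)

    # Step 2: join rank n to rank n + 1 and keep pairs that are not adjacent.
    gaps = []
    for (product, order), (_, earlier_end) in sorted(ranked.items()):
        later = ranked.get((product, order + 1))
        if later is not None and earlier_end + ONE_DAY != later[0]:
            gaps.append((product, earlier_end + ONE_DAY, later[0] - ONE_DAY))
    return gaps


sample = [
    ("product 3", date(2011, 10, 1), date(2011, 12, 31)),
    ("product 3", date(2011, 1, 1), date(2011, 6, 30)),
    ("product 2", date(2011, 1, 1), date(2011, 12, 31)),
]
print(missing_periods(sample))
```

For this sample the single reported gap is product 3, from 07/01/2011 to 09/30/2011, which corresponds to the missing_period_start_date and missing_period_end_date columns of the SQL.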

Answer 2 (score: 1)

Assuming your table is named products, your start date column is named s, and your end date column is named e:

create view max_interval as 
select product, 
max(e) - min(s) as max_interval 
from products group by product;


create view total_days as 
select product, 
sum( e - s ) + count(product) - 1 as total_days 
from products group by product  ;

Then this query gives you all products with "missing" ranges:

select a.*, b.*
from max_interval a 
left outer join total_days b 
on (a.product = b.product)
where a.max_interval <> b.total_days;

Since the group by is the same in both views, they can of course be combined into a single query, although that makes the solution less explicit:

select product, 
max(e) - min(s) as max_interval, 
sum( e - s ) + count(product) - 1 as total_days 
from products group by product  
having max(e) - min(s) <> sum( e - s ) + count(product) - 1;

But as Stephanie Page points out, this is a premature optimization; you are unlikely to scan for breaks in continuous spans very often.
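Why the comparison works: each span covers e - s + 1 days inclusive, so n non-overlapping spans cover sum(e - s) + n days, while the full range covers max(e) - min(s) + 1 days. With no gaps the two are equal, which rearranges to sum(e - s) + n - 1 = max(e) - min(s). A small Python sketch of that arithmetic, using the corrected sample dates (the function name `span_check` is an assumption for illustration):

```python
from datetime import date


def span_check(spans):
    """spans: list of (s, e) date pairs for one product.
    Returns (max_interval, total_days) as computed by the two views."""
    max_interval = (max(e for _, e in spans) - min(s for s, _ in spans)).days
    total_days = sum((e - s).days for s, e in spans) + len(spans) - 1
    return max_interval, total_days


product_1 = [
    (date(2011, 1, 1), date(2011, 3, 31)),
    (date(2011, 4, 1), date(2011, 6, 30)),
    (date(2011, 7, 1), date(2011, 9, 30)),
    (date(2011, 10, 1), date(2011, 12, 31)),
]
product_3 = [
    (date(2011, 1, 1), date(2011, 6, 30)),
    (date(2011, 10, 1), date(2011, 12, 31)),
]

print(span_check(product_1))  # (364, 364): no gap
print(span_check(product_3))  # (364, 272): the 92 days of July-September are missing
```

The check relies on spans not overlapping (EDIT 2 in the question); an overlap could cancel out a gap and hide it.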

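Answer 3 (score: 0)

You can use an exists clause to keep only rows that have an earlier row, and a not exists clause to find rows for which no row ends on the day before the current row starts. For example: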

select  *
from    TBL_PRODUCT t1
where   exists
        (
        select  *
        from    TBL_PRODUCT t2
        where   t2.PRODUCT = t1.PRODUCT
                and t2.PRICE_END_DATE < t1.PRICE_START_DATE
        )
        and not exists
        (
        select  *
        from    TBL_PRODUCT t3
        where   t3.PRODUCT = t1.PRODUCT
                and t3.PRICE_END_DATE + 1 = t1.PRICE_START_DATE
        );

Prints:

PRODUCT          UNIT_COST PRICE_STA PRICE_END
----------------------- ---------- --------- ---------
product 3           25 01-OCT-11 31-DEC-11

Answer 4 (score: 0)

You can do some mathematical comparison of the ranges, assuming you fix the bad dates in your sample set:

SELECT PRODUCT
FROM Sandbox.TBL_PRODUCT
GROUP BY PRODUCT
HAVING SUM(PRICE_END_DATE - PRICE_START_DATE + 1) < MAX(PRICE_END_DATE) - MIN(PRICE_START_DATE) + 1

Which returns:

PRODUCT                                                                         
-----------------
product 3                                                                       
1 row selected