查找一个字段更改时的开始日期和结束日期

时间:2013-03-26 23:49:08

标签: sql sliding-window

我在表格中有这些数据

FIELD_A   FIELD_B     FIELD_D
249052903   10/15/2011 N
249052903   11/15/2011 P ------------- VALUE CHANGED
249052903   12/15/2011 P
249052903   1/15/2012   N ------------- VALUE CHANGED
249052903   2/15/2012   N
249052903   3/15/2012   N
249052903   4/15/2012   N
249052903   5/15/2012   N
249052903   6/15/2012   N
249052903   7/15/2012   N
249052903   8/15/2012   N
249052903   9/15/2012   N

当FIELD_D中的值发生变化时,它会形成一个组,我需要该组中的最小和最大日期。查询应该返回

FIELD_A   GROUP_START   GROUP_END
249052903   10/15/2011  10/15/2011
249052903   11/15/2011  12/15/2011
249052903   1/15/2012              9/15/2012

到目前为止,我看到的示例中Field_D中的数据是唯一的。这里数据可以如图所示重复,首先它是“N”然后它变为“P”然后再变回“N”。

任何帮助将不胜感激

由于

4 个答案:

答案 0 :(得分:1)

使用子查询在SQL中表达相当容易:

select Field_A, Field_D, min(Field_B) as Group_Start, max(Field_B) as Group_End
from (select t.*,
             (select min(field_B)
              from t t2
              where t2.field_A = t.field_A and
                    t2.field_B > t.field_B and
                    t2.Field_D <> t.field_D
             ) as TheGroup
      from t
     ) t
group by Field_A, Field_D, TheGroup

这是使用相关子查询分配组标识符。标识符是Field_B的第一个值,其中Field_D发生了变化。

您没有提到您正在使用的数据库,因此这使用标准SQL。

答案 1 :(得分:1)

如果SQL实现支持,您可以使用分析函数 - LAG,LEAD和COUNT()OVER。 SQL小提琴here

WITH EndsMarked AS (
  SELECT
    FIELD_A,
    FIELD_B,
    CASE WHEN FIELD_D = LAG(FIELD_D,1) OVER (ORDER BY FIELD_B)
         THEN 0 ELSE 1 END AS IS_START,
    CASE WHEN FIELD_D = LEAD(FIELD_D,1) OVER (ORDER BY FIELD_B)
         THEN 0 ELSE 1 END AS IS_END
  FROM T
), GroupsNumbered AS (
  SELECT
    FIELD_A,
    FIELD_B,
    IS_START,
    IS_END,
    COUNT(CASE WHEN IS_START = 1 THEN 1 END)
      OVER (ORDER BY FIELD_B) AS GroupNum
  FROM EndsMarked
  WHERE IS_START=1 OR IS_END=1
)
  SELECT
    FIELD_A,
    MIN(FIELD_B) AS GROUP_START,
    MAX(FIELD_B) AS GROUP_END
    FROM GroupsNumbered
    GROUP BY FIELD_A, GroupNum;

答案 2 :(得分:0)

不要将SQL用于此问题,因为在单个表扫描的SQL中无法使用SQL,因为它需要在记录之间进行比较。它需要一个全表扫描加上至少一个自己的连接。使用命令式语言实现解决方案是微不足道的,它只需要一次表扫描。 编辑:存储过程最好。

答案 3 :(得分:0)

在您有多个Field_A的地方,我对答案做了一些修改。这应该总是有效的:-)

WITH EndsMarked 
AS 
(
    SELECT 

         [Field_A]
        ,[Field_B]
        ,CASE 
            WHEN LAG([Field_D],1) OVER (PARTITION BY [Field_A] ORDER BY [Field_A],[Field_B]) IS NULL
             AND ROW_NUMBER() OVER (PARTITION BY [Field_A] ORDER BY [Field_B]) = 1 
            THEN 1
            WHEN LAG([Field_D],1) OVER (PARTITION BY [Field_A] ORDER BY [Field_A],[Field_B]) > 0
              <> LAG([Field_D],0) OVER (PARTITION BY [Field_A] ORDER BY [Field_A],[Field_B]) > 0
            THEN 1 
            ELSE 0 
        END AS IS_START
       ,CASE 
            WHEN LEAD([Field_D],1) OVER (PARTITION BY [Field_A] ORDER BY [Field_A],[Field_B]) IS NULL
             AND ROW_NUMBER() OVER (PARTITION BY [Field_A] ORDER BY [Field_B] DESC) = 1 
            THEN 1
            WHEN LEAD([Field_D],0) OVER (PARTITION BY [Field_A] ORDER BY [Field_A],[Field_B]) 
              <> LEAD([Field_D],1) OVER (PARTITION BY [Field_A] ORDER BY [Field_A],[Field_B]) 
            THEN 1          
            ELSE 0 
        END                 AS IS_END

    FROM 
    (
        SELECT

            [Field_A]
           ,[Field_B]
           ,[Field_D]
           ,[Aantal Facturen]

        FROM [T]

    )   F
    
)
,GroupsNumbered 
AS 
(
  SELECT
     [Field_A]
    ,[Field_B]
    ,IS_START
    ,IS_END
    ,COUNT(CASE
               WHEN IS_START = 1 
               THEN 1 
           END)                     OVER (ORDER BY [Field_A]
                                                  ,[Field_B]) AS GroupNum
  FROM      EndsMarked
  WHERE     IS_START        = 1 
     OR     IS_END          = 1
)

    SELECT

        [Field_A]
        ,MIN([Field_B]) AS GROUP_START
        ,MAX([Field_B]) AS GROUP_END

    FROM GroupsNumbered

    GROUP BY [Field_A], GroupNum