How to get the min and max date for each run of rows

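Time: 2013-11-30 10:36:41

Tags: sql oracle plsql

Scenario: a company has many branches across many states, and a state may have more than one branch. Whenever an employee transfers from one branch to another, an entry is made in a table like the following: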

| EID |          DT | BRANCH | STATE |
|-----|-------------|--------|-------|
|   1 | 01-JAN-2000 |      A |    AA |
|   1 | 01-JAN-2001 |      B |    AA |
|   1 | 01-JAN-2002 |      C |    AA |
|   1 | 01-JAN-2003 |      D |    AA |
|   1 | 01-JAN-2004 |      E |    BB |
|   1 | 01-JAN-2005 |      F |    BB |
|   1 | 01-JAN-2006 |      G |    BB |
|   1 | 01-JAN-2007 |      H |    BB |
|   1 | 01-JAN-2008 |      A |    AA |
|   1 | 01-JAN-2009 |      B |    AA |
|   1 | 01-JAN-2010 |      C |    AA |
|   1 | 01-JAN-2011 |      D |    AA |

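The requirement is to find how long the employee stayed in a particular state, counting each consecutive stay separately. The output should look like this (duration in years):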

| STATE |         MIN |         MAX |    Duration |
|-------|-------------|-------------|-------------|
|    AA | 01-JAN-2000 | 01-JAN-2003 |           3 |
|    BB | 01-JAN-2004 | 01-JAN-2007 |           3 |
|    AA | 01-JAN-2008 | 01-JAN-2011 |           3 |

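I can't figure out how to do this in PL/SQL. The long way would be to loop over every row with a FOR loop and work out the duration. Is there a way to do it in PL/SQL without using a loop?

Here is a SQLFiddle Demo

4 answers:

Answer 0 (score: 3)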

SQL Fiddle

WITH groups AS (
  SELECT
    t1.*,
    ROW_NUMBER() OVER ( ORDER BY dt )
      - ROW_NUMBER() OVER ( PARTITION BY state ORDER BY dt ) AS grp
  FROM t1
)
SELECT state,
       MIN( dt ) AS first_date,
       MAX( dt ) AS last_date,
       TRUNC( ( MAX( dt ) - MIN( dt ) ) / 365 ) AS duration
FROM   groups
GROUP BY state, grp
ORDER BY first_date

Results:

| STATE |                     FIRST_DATE |                      LAST_DATE | DURATION |
|-------|--------------------------------|--------------------------------|----------|
|    AA | January, 01 2000 00:00:00+0000 | January, 01 2003 00:00:00+0000 |        3 |
|    BB | January, 01 2004 00:00:00+0000 | January, 01 2007 00:00:00+0000 |        3 |
|    AA | January, 01 2008 00:00:00+0000 | January, 01 2011 00:00:00+0000 |        3 |

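As to how it works:

  • The groups sub-query selects every row and assigns it to a group by subtracting the row's row number within its own state from its row number across all states. The result is that:
    • any consecutive sequence of rows with the same state gets the same group number; and
    • for any given state, each successive group of rows gets a larger group number as the dates increase (this does not necessarily hold when comparing groups of different states, but that does not matter given the grouping used in the final step).
  • The final query then groups everything on state and grp and finds the min and max of the dates in each group.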
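To see why the subtraction works, it can help to look at the two row numbers side by side. The sketch below is illustrative only: the inline t1 is a hypothetical stand-in for the question's table, generated with CONNECT BY (one row per year, BB for rows 5 to 8 and AA otherwise), and the aliases rn_all and rn_state are names introduced here.

```sql
-- Hypothetical stand-in for t1: 12 yearly rows, states AA x4, BB x4, AA x4.
WITH t1 AS (
  SELECT 1 AS eid,
         ADD_MONTHS(DATE '2000-01-01', 12 * (LEVEL - 1)) AS dt,
         CASE WHEN LEVEL BETWEEN 5 AND 8 THEN 'BB' ELSE 'AA' END AS state
  FROM   dual
  CONNECT BY LEVEL <= 12
)
SELECT dt,
       state,
       -- position of the row among all rows
       ROW_NUMBER() OVER ( ORDER BY dt ) AS rn_all,
       -- position of the row among rows of the same state
       ROW_NUMBER() OVER ( PARTITION BY state ORDER BY dt ) AS rn_state,
       -- the difference stays constant inside one consecutive run
       ROW_NUMBER() OVER ( ORDER BY dt )
         - ROW_NUMBER() OVER ( PARTITION BY state ORDER BY dt ) AS grp
FROM   t1
ORDER BY dt;
```

On this data the first AA run gets grp = 0, while the BB run and the second AA run both get grp = 4. That collision is why the final query groups by state and grp together rather than by grp alone.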

Answer 1 (score: 2)

Here is one way to get it done:

select max(z.state) as state
     , min(z.dt)    as min_date   /* main query */
     , max(z.dt)    as max_date
     , trunc((max(z.dt) - min(z.dt)) / 365) as duration
  from (select q.eid
             , q.dt              /* query # 2*/
             , state 
             , sum(grp) over(order by q.dt) as grp
          from (select eid
                     , dt
                     , state     /* query # 1*/
                     , case
                         when state <> lag(state) over(order by dt)
                         then 1
                       end as grp 
                  from t1 ) q
       ) z
  group by z.grp
  order by min_date

Results:

STATE MIN_DATE    MAX_DATE      DURATION
----- ----------- ----------- ----------
AA    01-JAN-00   01-JAN-03            3
BB    01-JAN-04   01-JAN-07            3
AA    01-JAN-08   01-JAN-11            3

SQLFiddle Demo
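The queries above compute the duration as the day difference divided by 365, which gives 3 for every run in the sample data. Because leap years have 366 days, that division can round a span up to a full year slightly too early. One possible alternative, offered here as a suggestion rather than as part of the original answers, is Oracle's MONTHS_BETWEEN function. The comparison below uses two hard-coded dates chosen for illustration; by_days and by_months are names introduced here.

```sql
-- 1 Jan 2000 to 31 Dec 2000 is 365 days, one day short of a full (leap) year.
SELECT TRUNC( ( DATE '2000-12-31' - DATE '2000-01-01' ) / 365 ) AS by_days,
       FLOOR( MONTHS_BETWEEN( DATE '2000-12-31', DATE '2000-01-01' ) / 12 ) AS by_months
FROM   dual;
```

Here by_days evaluates to 1 and by_months to 0. In the grouped queries the same idea would read FLOOR( MONTHS_BETWEEN( MAX( dt ), MIN( dt ) ) / 12 ).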


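Addendum #1: explanation of the query.

To get the minimum and maximum dates we would simply apply a GROUP BY clause, that much is obvious. We cannot do so directly, because there is a logical difference between the AA state that comes before the BB state and the AA state that comes after it. So we have to do something to separate them and put them into different logical groups. That is what the innermost query (/* query # 1*/) and /* query # 2*/ do. Query #1 finds the moments when the state changes: it compares the state of the current row with that of the previous row, using the lag() over() function to reference the previous row in the data set. Query #2 forms the logical groups by computing a running total of grp, for which the sum() over() analytic function is responsible.

Query #1 gives us: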

       EID DT           STATE        GRP
---------- -----------  -----    ----------
         1 01-JAN-2000   AA    
         1 01-JAN-2001   AA    
         1 01-JAN-2002   AA    
         1 01-JAN-2003   AA    
         1 01-JAN-2004   BB           1  --<-- moment when state changes
         1 01-JAN-2005   BB    
         1 01-JAN-2006   BB    
         1 01-JAN-2007   BB    
         1 01-JAN-2008   AA           1  --<-- moment when state changes
         1 01-JAN-2009   AA    
         1 01-JAN-2010   AA    
         1 01-JAN-2011   AA    

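Query #2 forms the logical groups (the first group keeps a NULL grp, which GROUP BY treats as a group of its own):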

       EID DT           STATE        GRP
---------- -----------  -----    ----------
         1 01-JAN-2000   AA    
         1 01-JAN-2001   AA    
         1 01-JAN-2002   AA    
         1 01-JAN-2003   AA    
         1 01-JAN-2004   BB           1   
         1 01-JAN-2005   BB           1
         1 01-JAN-2006   BB           1
         1 01-JAN-2007   BB           1 
         1 01-JAN-2008   AA           2 
         1 01-JAN-2009   AA           2
         1 01-JAN-2010   AA           2
         1 01-JAN-2011   AA           2

Then, in the main query, we simply group by GRP to produce the final output.
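The query above orders by dt only, which is enough for the single employee in the sample data. If the table held several employees, one possible adaptation (an assumption, not part of the original answer) would be to add PARTITION BY eid to both analytic functions and eid to the GROUP BY. The inline t1 below is hypothetical sample data with two employees sharing the same history; the names flagged, grouped, and chg are introduced here for illustration.

```sql
-- Hypothetical data: employees 1 and 2, each with AA x4, BB x4, AA x4.
WITH t1 AS (
  SELECT e.eid,
         ADD_MONTHS(DATE '2000-01-01', 12 * (n.lvl - 1)) AS dt,
         CASE WHEN n.lvl BETWEEN 5 AND 8 THEN 'BB' ELSE 'AA' END AS state
  FROM   (SELECT LEVEL AS lvl FROM dual CONNECT BY LEVEL <= 12) n
  CROSS JOIN (SELECT 1 AS eid FROM dual UNION ALL SELECT 2 FROM dual) e
),
flagged AS (
  -- 1 on the first row of each new run within an employee, 0 otherwise
  SELECT eid, dt, state,
         CASE
           WHEN state <> LAG(state) OVER (PARTITION BY eid ORDER BY dt)
           THEN 1
           ELSE 0
         END AS chg
  FROM   t1
),
grouped AS (
  -- running total of the change flags numbers the runs per employee
  SELECT eid, dt, state,
         SUM(chg) OVER (PARTITION BY eid ORDER BY dt) AS grp
  FROM   flagged
)
SELECT eid,
       MAX(state) AS state,
       MIN(dt)    AS min_date,
       MAX(dt)    AS max_date,
       TRUNC((MAX(dt) - MIN(dt)) / 365) AS duration
FROM   grouped
GROUP BY eid, grp
ORDER BY eid, min_date;
```

With this sample data each employee should come back with three rows (AA, BB, AA), each with a duration of 3.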

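Answer 2 (score: 2)

OK, I changed the query, but it does not seem to work correctly (the output below comes from a data set that extends beyond the rows shown in the question):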

with t2 as
(select t1.*,
  case lag(state,1,state) over (order by dt)
  when state then 0 else 1 end as state_chng
from t1),
t3 as 
  (select t2.*,
    sum(state_chng) over (order by dt) as group_id
  from t2)
select distinct state,
  min(dt) over (partition by GROUP_ID) as min_dt,
  max(dt) over (partition by GROUP_ID) as max_dt
from t3
order by 2;

| STATE |                         MIN_DT |                         MAX_DT |
|-------|--------------------------------|--------------------------------|
|    AA | January, 01 2000 00:00:00+0000 | January, 01 2003 00:00:00+0000 |
|    BB | January, 01 2004 00:00:00+0000 | January, 01 2008 00:00:00+0000 |
|    AA | January, 01 2009 00:00:00+0000 | January, 01 2012 00:00:00+0000 |
|    BB | January, 01 2013 00:00:00+0000 | January, 01 2014 00:00:00+0000 |
|    AA | January, 01 2015 00:00:00+0000 | January, 01 2018 00:00:00+0000 |

Answer 3 (score: 1)

Without a stored procedure, analytic functions are the only way to do this.

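WITH s1 AS (
-- Mark run boundaries: mindt is set only on the first row of a run of
-- equal states, maxdt only on the last row; all other rows get NULL.
SELECT eid
     , dt
     , state 
     , CASE WHEN LAG(state) 
                 OVER (PARTITION BY eid 
                           ORDER BY dt) 
                 = state           
            THEN NULL 
            ELSE dt 
       END mindt
     , CASE WHEN LEAD(state) 
                 OVER (PARTITION BY eid 
                           ORDER BY dt) 
                 = state           
            THEN NULL 
            ELSE dt 
       END maxdt
  FROM t1
), s2 as (
-- The running MAX carries the latest run start forward to every row of
-- the run. The running maxdt still shows the previous run's end (or NULL)
-- until the last row of the current run is reached.
select eid
     , state
     , MAX(mindt) 
       OVER (PARTITION BY eid 
              ORDER BY dt) 
       mindt
     , MAX(maxdt) 
       OVER (PARTITION BY eid 
                 ORDER BY dt) 
       maxdt
  FROM s1
)
-- One row per (eid, state, run start); MAX(maxdt) picks the run's own end.
SELECT eid
     , state
     , mindt
     , MAX(maxdt) maxdt
  FROM s2
 GROUP BY eid
     , state
     , mindt
 ORDER BY eid
     , mindt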