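Scenario: a company has many branches in many states. A state may have more than one branch. Whenever an employee transfers from one branch to another, an entry is made in a table like the following: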
| EID | DT | BRANCH | STATE |
|-----|-------------|--------|-------|
| 1 | 01-JAN-2000 | A | AA |
| 1 | 01-JAN-2001 | B | AA |
| 1 | 01-JAN-2002 | C | AA |
| 1 | 01-JAN-2003 | D | AA |
| 1 | 01-JAN-2004 | E | BB |
| 1 | 01-JAN-2005 | F | BB |
| 1 | 01-JAN-2006 | G | BB |
| 1 | 01-JAN-2007 | H | BB |
| 1 | 01-JAN-2008 | A | AA |
| 1 | 01-JAN-2009 | B | AA |
| 1 | 01-JAN-2010 | C | AA |
| 1 | 01-JAN-2011 | D | AA |
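The requirement is to find out how long the employee stayed in each state before moving to another one. The output should look like this: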
| STATE | MIN | MAX | Duration |
|-------|-------------|-------------|-------------|
| AA | 01-JAN-2000 | 01-JAN-2003 | 3 |
| BB | 01-JAN-2004 | 01-JAN-2007 | 3 |
| AA | 01-JAN-2008 | 01-JAN-2011 | 3 |
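I can't seem to figure out how to do this in PL/SQL. The long way would be to iterate over every row with a for loop and work out the duration. But is there a way to do it in PL/SQL without using a loop?
Here is a SQLFiddle Demo
Answer 0 (score: 3)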
WITH groups AS (
SELECT
t1.*,
ROW_NUMBER() OVER ( ORDER BY dt )
- ROW_NUMBER() OVER ( PARTITION BY state ORDER BY dt ) AS grp
FROM t1
)
SELECT state,
MIN( dt ) AS first_date,
MAX( dt ) AS last_date,
TRUNC( ( MAX( dt ) - MIN( dt ) ) / 365 ) AS duration
FROM groups
GROUP BY state, grp
ORDER BY first_date
Results:
| STATE | FIRST_DATE | LAST_DATE | DURATION |
|-------|--------------------------------|--------------------------------|----------|
| AA | January, 01 2000 00:00:00+0000 | January, 01 2003 00:00:00+0000 | 3 |
| BB | January, 01 2004 00:00:00+0000 | January, 01 2007 00:00:00+0000 | 3 |
| AA | January, 01 2008 00:00:00+0000 | January, 01 2011 00:00:00+0000 | 3 |
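To see why the difference of the two `ROW_NUMBER()` values separates the runs, the arithmetic of the `groups` subquery can be replayed outside the database. The following is a minimal Python sketch, not part of the original answer; the sample rows are taken from the question's table, and the function name `islands_by_row_number_difference` is made up for illustration.

```python
from datetime import date

# Sample data from the question: one (dt, state) row per year for employee 1.
states = ["AA"] * 4 + ["BB"] * 4 + ["AA"] * 4
rows = [(date(2000 + i, 1, 1), state) for i, state in enumerate(states)]


def islands_by_row_number_difference(rows):
    """Mimic ROW_NUMBER() OVER (ORDER BY dt)
    minus ROW_NUMBER() OVER (PARTITION BY state ORDER BY dt)."""
    seen_per_state = {}  # rows counted so far within each state
    groups = {}          # (state, grp) -> dates belonging to that run
    for overall_rn, (dt, state) in enumerate(sorted(rows), start=1):
        state_rn = seen_per_state.get(state, 0) + 1
        seen_per_state[state] = state_rn
        grp = overall_rn - state_rn  # constant while the state does not change
        groups.setdefault((state, grp), []).append(dt)

    result = []
    for (state, _grp), dates in groups.items():
        first, last = min(dates), max(dates)
        # Same approximation as TRUNC((MAX(dt) - MIN(dt)) / 365) in the query.
        result.append((state, first, last, (last - first).days // 365))
    return sorted(result, key=lambda r: r[1])  # ORDER BY first_date


islands = islands_by_row_number_difference(rows)
for island in islands:
    print(island)
```

With this data the `BB` run and the second `AA` run both end up with `grp = 4`, which is why the query groups by `state` and `grp` together rather than by `grp` alone.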
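As for how it works:
The `groups` subquery selects every row and assigns it to a group by subtracting the row number counted within that row's `state` from the row number counted across all rows, regardless of `state`. While the `state` stays the same, both counters advance together, so the difference is constant for each unbroken run of a `state`.
The outer query then groups everything on `state` and `grp`, and finds the `MIN` and `MAX` of the dates in each group.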
Answer 1 (score: 2)
Here is one way to do it:
select max(z.state) as state
, min(z.dt) as min_date /* main query */
, max(z.dt) as max_date
, trunc((max(z.dt) - min(z.dt)) / 365) as duration
from (select q.eid
, q.dt /* query # 2*/
, state
, sum(grp) over(order by q.dt) as grp
from (select eid
, dt
, state /* query # 1*/
, case
when state <> lag(state) over(order by dt)
then 1
end as grp
from t1 ) q
) z
group by z.grp
order by min_date
Results:
STATE MIN_DATE MAX_DATE DURATION
----- ----------- ----------- ----------
AA 01-JAN-00 01-JAN-03 3
BB 01-JAN-04 01-JAN-07 3
AA 01-JAN-08 01-JAN-11 3
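Addendum #1: explanation of the query.
To get the minimum and maximum dates we would simply apply a `group by` clause, that much is obvious, but we cannot, because there is a logical difference between the `AA` state that comes before the `BB` state and the one that comes after it. So we have to do something to separate them and put them into different logical groups. That is what the innermost query (`/* query # 1*/`) and `/* query # 2*/` do. Query #1 finds the moments when the state changes: it compares the current row's `state` with the previous row's, using the `lag() over()` function to reference the previous row in the data set. Query #2 forms the logical groups by computing a running total of `grp`, which is the job of the `sum() over()` analytic function.
Query #1 gives us: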
EID DT STATE GRP
---------- ----------- ----- ----------
1 01-JAN-2000 AA
1 01-JAN-2001 AA
1 01-JAN-2002 AA
1 01-JAN-2003 AA
1 01-JAN-2004 BB 1 --<-- moment when state changes
1 01-JAN-2005 BB
1 01-JAN-2006 BB
1 01-JAN-2007 BB
1 01-JAN-2008 AA 1 --<-- moment when state changes
1 01-JAN-2009 AA
1 01-JAN-2010 AA
1 01-JAN-2011 AA
Query #2 forms the logical groups:
EID DT STATE GRP
---------- ----------- ----- ----------
1 01-JAN-2000 AA
1 01-JAN-2001 AA
1 01-JAN-2002 AA
1 01-JAN-2003 AA
1 01-JAN-2004 BB 1
1 01-JAN-2005 BB 1
1 01-JAN-2006 BB 1
1 01-JAN-2007 BB 1
1 01-JAN-2008 AA 2
1 01-JAN-2009 AA 2
1 01-JAN-2010 AA 2
1 01-JAN-2011 AA 2
Then, in the main query, we simply group by `GRP` to produce the final output.
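The change-flag and running-total steps can also be traced outside the database. This is a minimal Python sketch under a few assumptions, not the answer's own code: the rows are the sample data from the question, and the flag is `0` rather than SQL `NULL` for unchanged rows, so the first group is numbered `0` instead of being left empty.

```python
from datetime import date
from itertools import accumulate, groupby

# Sample data from the question, already ordered by dt: (dt, state).
states = ["AA"] * 4 + ["BB"] * 4 + ["AA"] * 4
rows = [(date(2000 + i, 1, 1), state) for i, state in enumerate(states)]

# Query #1: flag a row with 1 when its state differs from the previous row's
# state (the LAG comparison). The first row has no predecessor and gets 0.
flags = [0] + [int(cur[1] != prev[1]) for prev, cur in zip(rows, rows[1:])]

# Query #2: a running total of the flags (SUM() OVER (ORDER BY dt))
# gives every unbroken run of a state its own group number.
group_ids = list(accumulate(flags))

# Main query: group by the group number and take MIN/MAX of the dates.
summary = []
for _gid, members in groupby(zip(group_ids, rows), key=lambda pair: pair[0]):
    run = [row for _, row in members]
    first = min(dt for dt, _ in run)
    last = max(dt for dt, _ in run)
    summary.append((run[0][1], first, last, (last - first).days // 365))

print(group_ids)
for line in summary:
    print(line)
```

Because the running total never decreases, rows of the same group are always adjacent, which is what lets `itertools.groupby` stand in for `GROUP BY` here.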
Answer 2 (score: 2)
OK, I changed the query, but it does not seem to work correctly:
with t2 as
(select t1.*,
case lag(state,1,state) over (order by dt)
when state then 0 else 1 end as state_chng
from t1),
t3 as
(select t2.*,
sum(state_chng) over (order by dt) as group_id
from t2)
select distinct state,
min(dt) over (partition by GROUP_ID) as min_dt,
max(dt) over (partition by GROUP_ID) as max_dt
from t3
order by 2;
| STATE | MIN_DT | MAX_DT |
|-------|--------------------------------|--------------------------------|
| AA | January, 01 2000 00:00:00+0000 | January, 01 2003 00:00:00+0000 |
| BB | January, 01 2004 00:00:00+0000 | January, 01 2008 00:00:00+0000 |
| AA | January, 01 2009 00:00:00+0000 | January, 01 2012 00:00:00+0000 |
| BB | January, 01 2013 00:00:00+0000 | January, 01 2014 00:00:00+0000 |
| AA | January, 01 2015 00:00:00+0000 | January, 01 2018 00:00:00+0000 |
Answer 3 (score: 1)
Without stored procedures, analytic functions are the only way to do this.
WITH s1 AS (
SELECT eid
, dt
, state
, CASE WHEN LAG(state)
OVER (PARTITION BY eid
ORDER BY dt)
= state
THEN NULL
ELSE dt
END mindt
, CASE WHEN LEAD(state)
OVER (PARTITION BY eid
ORDER BY dt)
= state
THEN NULL
ELSE dt
END maxdt
FROM t1
), s2 as (
select eid
, state
, MAX(mindt)
OVER (PARTITION BY eid
ORDER BY dt)
mindt
, MAX(maxdt)
OVER (PARTITION BY eid
ORDER BY dt)
maxdt
FROM s1
)
SELECT eid
, state
, mindt
, MAX(maxdt) maxdt
FROM s2
GROUP BY eid
, state
, mindt
ORDER BY eid
, mindt