How to add a flag to the rows after a specific value within a key in Oracle SQL?

Time: 2018-09-18 20:21:30

Tags: sql oracle lag analytic-functions lead

I have the following data:

Key Stage CreateDate
AAF 0     01-Jan-2018
AAF 0     02-Jan-2018
AAF 0     10-Jan-2018
AAF 20    20-Jan-2018
AAF 40    20-Mar-2018
AAF 0     01-May-2018
AAF 0     10-May-2018
AAF 0     20-May-2018
AAF 30    20-Jun-2018
AAF 10    20-Jul-2018
AAF 100   20-Jul-2018

I am basically trying to calculate the days spent at each stage. Currently I take the minimum date within each stage and compute the difference from the minimum date of the next stage:

select 
key,
stage,
cast(extract (day from max(next_dt) - min(createddate)) as number) as interval_days
from
(
select 
key,
stage,
createddate,
lead(createddate,1) over (partition by  key order by createddate) next_dt
from  oppstages
)
group by key,stage 

As can be seen, the stage sometimes progresses from 0 to 40 and then goes back to 0, so the above logic doesn't work correctly. I need to group the 0-40 rows as one category, the rows after 40 as the next category, and so on (a new category starts whenever the stage decreases and restarts from a lower stage number). The query below gives me the points where the stage (probability) goes down, but I am not able to flag the rows so that I can group them further.

select key,
stage,
createddate, 
next_dt,
next_prob,
case when next_prob < stage then 1 else 0 end as valid_flag
from 
(
select 
key,
stage,
createddate,
lead(createddate,1) over (partition by  key order by createddate) next_dt, 
coalesce(lead(stage,1) over (partition by  key order by createddate),101) next_prob
from oppstages
) a

I expect this output so that I could group using flag to calculate the days spent at each instance:

Key Stage CreateDate    Flag
AAF 0     01-Jan-2018   1
AAF 0     02-Jan-2018   1
AAF 0     10-Jan-2018   1
AAF 20    20-Jan-2018   1
AAF 40    20-Mar-2018   1
AAF 0     01-May-2018   2
AAF 0     10-May-2018   2
AAF 0     20-May-2018   2
AAF 30    20-Jun-2018   2
AAF 10    20-Jul-2018   3
AAF 100   20-Jul-2018   3

thanks.

2 Answers:

Answer 0 (score: 2)

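You can try the LAG window function to get the previous Stage value.

Then use CASE WHEN to check whether PREVAL > STAGE; each time that is true, the running SUM increases the flag by 1.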

CREATE TABLE T(
  Key varchar(50),
  Stage int,
  CreateDate date
);



INSERT INTO T VALUES ('AAF',0,TO_DATE('01-01-2018','dd-mm-yyyy'));  
INSERT INTO T VALUES ('AAF',0,TO_DATE('02-01-2018','dd-mm-yyyy'));  
INSERT INTO T VALUES ('AAF',0,TO_DATE('10-01-2018','dd-mm-yyyy'));  
INSERT INTO T VALUES ('AAF',20,TO_DATE('20-01-2018','dd-mm-yyyy'));  
INSERT INTO T VALUES ('AAF',40,TO_DATE('20-03-2018','dd-mm-yyyy'));  
INSERT INTO T VALUES ('AAF',0,TO_DATE('01-05-2018','dd-mm-yyyy'));  
INSERT INTO T VALUES ('AAF',0,TO_DATE('10-05-2018','dd-mm-yyyy'));  
INSERT INTO T VALUES ('AAF',0,TO_DATE('20-05-2018','dd-mm-yyyy'));  
INSERT INTO T VALUES ('AAF',30,TO_DATE('20-06-2018','dd-mm-yyyy'));  
INSERT INTO T VALUES ('AAF',10,TO_DATE('20-07-2018','dd-mm-yyyy'));  
INSERT INTO T VALUES ('AAF',100,TO_DATE('20-07-2018','dd-mm-yyyy'));  

Query 1

SELECT t1.KEY,
       t1.STAGE,
      (SUM(CASE WHEN PREVAL > STAGE THEN 1 ELSE 0 END) over (partition by Key order by CreateDate) + 1)  Flag
FROM (
  SELECT T.*,lag(Stage) over (partition by Key  order by CreateDate) preVAL
  FROM T 
)t1

Results

| KEY | STAGE | FLAG |
|-----|-------|------|
| AAF |     0 |    1 |
| AAF |     0 |    1 |
| AAF |     0 |    1 |
| AAF |    20 |    1 |
| AAF |    40 |    1 |
| AAF |     0 |    2 |
| AAF |     0 |    2 |
| AAF |     0 |    2 |
| AAF |    30 |    2 |
| AAF |    10 |    3 |
| AAF |   100 |    3 |
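
To get from the flag to the days spent in each run, one possible follow-up is to aggregate by the flag and compare each group's first date with the first date of the next group. This is only a sketch against the sample table T above: the names flagged, start_dt, end_dt and days_in_group are made up for illustration, and it assumes "days spent" means the time from the start of one group to the start of the next.

```sql
-- Sketch only: days per flagged group, built on the Flag logic from Query 1.
WITH flagged AS (
  SELECT t1.Key,
         t1.Stage,
         t1.CreateDate,
         SUM(CASE WHEN preVAL > Stage THEN 1 ELSE 0 END)
           OVER (PARTITION BY Key ORDER BY CreateDate) + 1 AS Flag
  FROM (
    SELECT T.*,
           LAG(Stage) OVER (PARTITION BY Key ORDER BY CreateDate) AS preVAL
    FROM T
  ) t1
)
SELECT Key,
       Flag,
       MIN(CreateDate) AS start_dt,
       MAX(CreateDate) AS end_dt,
       -- In Oracle, DATE minus DATE yields a number of days.
       -- LEAD over the grouped rows picks up the next group's start date.
       LEAD(MIN(CreateDate)) OVER (PARTITION BY Key ORDER BY MIN(CreateDate))
         - MIN(CreateDate) AS days_in_group
FROM flagged
GROUP BY Key, Flag
ORDER BY Key, Flag;
```

On the sample rows this should give 120 days for flag 1 (01-Jan-2018 to 01-May-2018), 80 days for flag 2 (01-May-2018 to 20-Jul-2018), and NULL for the last group, since nothing follows it.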

Answer 1 (score: 1)

You have a "gaps and islands" problem. A simple solution uses the difference of row numbers. This defines the groups.

select os.*, (seqnum - seqnum_2) as grp
from (select os.*,
             row_number() over (partition by key order by createdate) as seqnum,
             row_number() over (partition by key, stage order by createdate) as seqnum_2
      from oppstages os
     ) os;
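
To see why the difference of row numbers marks the islands, it can help to print both counters side by side. The sketch below is an illustration rather than part of the original answer: it adds stage as a tie-breaker to the first ORDER BY (an assumption, so that the two rows dated 20-Jul-2018 get a repeatable order) and sorts the output by seqnum.

```sql
-- Sketch only: show both row numbers and their difference per row.
SELECT os.key,
       os.stage,
       os.createdate,
       os.seqnum,                          -- position within the key
       os.seqnum_2,                        -- position within (key, stage)
       os.seqnum - os.seqnum_2 AS grp      -- constant within one island of a stage
FROM (SELECT os.*,
             ROW_NUMBER() OVER (PARTITION BY key ORDER BY createdate, stage) AS seqnum,
             ROW_NUMBER() OVER (PARTITION BY key, stage ORDER BY createdate) AS seqnum_2
      FROM oppstages os
     ) os
ORDER BY os.key, os.seqnum;
```

For the first eight sample rows, grp comes out as 0, 0, 0 (first run of stage 0), 3 (stage 20), 4 (stage 40), then 2, 2, 2 (second run of stage 0). The value only identifies an island together with stage, which is why the grouping uses key, stage and the difference. Note that this groups consecutive rows with the same stage, which is a different grouping from the Flag in the question.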

What you probably want is an aggregation:

select key, stage, min(createdate), max(createdate),
       lead(min(createdate)) over (partition by key order by min(createdate)) as next_createdate
from (select os.*,
             row_number() over (partition by key order by createdate) as seqnum,
             row_number() over (partition by key, stage order by createdate) as seqnum_2
      from oppstages os
     ) os
group by key, stage, (seqnum_2 - seqnum)

I'm not sure what logic you need for the durations, but this should contain all the information you need.
