I have the following data:
Key Stage CreateDate
AAF 0 01-Jan-2018
AAF 0 02-Jan-2018
AAF 0 10-Jan-2018
AAF 20 20-Jan-2018
AAF 40 20-Mar-2018
AAF 0 01-May-2018
AAF 0 10-May-2018
AAF 0 20-May-2018
AAF 30 20-Jun-2018
AAF 0 20-Jul-2018
AAF 100 20-Jul-2018
I am basically trying to calculate the days spent at each stage.. I am currently taking the minimum date within each stage, and find the difference between the minimum date of the next stage:
select
key,
stage,
cast(extract (day from max(next_dt) - min(createddate)) as number) as interval_days
from
(
select
key,
stage,
createddate
lead(createddate,1) over (partition by key order by createddate) next_dt
from oppstages
)
group by key,stage
As it can be seen, sometimes, the stage progresses from 0-40, but again goes back to 0. So the above logic doesn't work correctly and I am seeing a necessity to group the 0-40 as one category, and anything after 40 as next category and so on (if the stage decreases and restarts with new lesser stage number). The below query gives me the point where probability goes down, but I am not able flag to group the rows further.
select key,
stage,
createddate,
next_dt,
next_prob,
case when next_prob < stage then 1 else 0 end as valid_flag,
from
(
select
key,
stage,
createddate,
lead(createddate,1) over (partition by key order by createddate) next_dt,
coalesce(lead(stage,1) over (partition by key order by createddate),101) next_prob,
from oppstages
) a
I expect this output so that I could group using flag to calculate the days spent at each instance:
Key Stage CreateDate Flag
AAF 0 01-Jan-2018 1
AAF 0 02-Jan-2018 1
AAF 0 10-Jan-2018 1
AAF 20 20-Jan-2018 1
AAF 40 20-Mar-2018 1
AAF 0 01-May-2018 2
AAF 0 10-May-2018 2
AAF 0 20-May-2018 2
AAF 30 20-Jun-2018 2
AAF 10 20-Jul-2018 3
AAF 100 20-Jul-2018 3
thanks.
答案 0 :(得分:2)
您可以尝试使用lag
窗口函数获取先前的Stage
值。
然后使用CASE WHEN
检查PREVAL > STAGE
确实增加1
。
CREATE TABLE T(
Key varchar(50),
Stage int,
CreateDate date
);
INSERT INTO T VALUES ('AAF',0,TO_DATE('01-01-2018','dd-mm-yyyy'));
INSERT INTO T VALUES ('AAF',0,TO_DATE('02-01-2018','dd-mm-yyyy'));
INSERT INTO T VALUES ('AAF',0,TO_DATE('10-01-2018','dd-mm-yyyy'));
INSERT INTO T VALUES ('AAF',20,TO_DATE('20-01-2018','dd-mm-yyyy'));
INSERT INTO T VALUES ('AAF',40,TO_DATE('20-03-2018','dd-mm-yyyy'));
INSERT INTO T VALUES ('AAF',0,TO_DATE('01-05-2018','dd-mm-yyyy'));
INSERT INTO T VALUES ('AAF',0,TO_DATE('10-05-2018','dd-mm-yyyy'));
INSERT INTO T VALUES ('AAF',0,TO_DATE('20-05-2018','dd-mm-yyyy'));
INSERT INTO T VALUES ('AAF',30,TO_DATE('20-06-2018','dd-mm-yyyy'));
INSERT INTO T VALUES ('AAF',10,TO_DATE('20-07-2018','dd-mm-yyyy'));
INSERT INTO T VALUES ('AAF',100,TO_DATE('20-07-2018','dd-mm-yyyy'));
查询1 :
SELECT t1.KEY,
t1.STAGE,
(SUM(CASE WHEN PREVAL > STAGE THEN 1 ELSE 0 END) over (partition by Key order by CreateDate) + 1) Flag
FROM (
SELECT T.*,lag(Stage) over (partition by Key order by CreateDate) preVAL
FROM T
)t1
Results :
| KEY | STAGE | FLAG |
|-----|-------|------|
| AAF | 0 | 1 |
| AAF | 0 | 1 |
| AAF | 0 | 1 |
| AAF | 20 | 1 |
| AAF | 40 | 1 |
| AAF | 0 | 2 |
| AAF | 0 | 2 |
| AAF | 0 | 2 |
| AAF | 30 | 2 |
| AAF | 10 | 3 |
| AAF | 100 | 3 |
答案 1 :(得分:1)
您有一个“孤岛”问题。一个简单的解决方案使用行号的差异。这定义了组。
select t.*, (seqnum_2 - seqnum_1) as grp
from (select os.*,
row_number() over (partition by key order by createdate) as seqnum,
row_number() over (partition by key, stage order by createdate) as seqnum_2
from oppstages os
) os;
您可能想要的是一个聚合:
select key, stage, min(createdate), max(createdate),
lead(min(createdate)) over (partition by key, stage, seqnum - seqnum_2 order by createdate) as next_creatdate
from (select os.*,
row_number() over (partition by key order by createdate) as seqnum,
row_number() over (partition by key, stage order by createdate) as seqnum_2
from oppstages os
) os
group by key, stage, (seqnum_2 - seqnum)
我不确定您在这段时间内需要什么逻辑,但这应该包含您需要的所有信息。