Calculating time to reach a status, with conditions

Time: 2018-02-03 16:04:47

Tags: sql amazon-redshift

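I'm trying to calculate how long it took a rep to get x clients to apply for a service. This means I need the time between date_created, i.e. the date the rep joined, and the moment the rep reached a certain "status". A status is reached when x of the rep's clients (= users) have a non-null date_applied, i.e. the date the user signed up.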

x is the minimum threshold for reaching each "status". Following on from my previous question, "Aggregate case when inside non aggregate query", I'm calculating "status" as follows:

  case when count(date_applied) over (partition by rep_id) >=10 then 'status1'
    when count(date_applied) over (partition by rep_id) >=5 then 'status2'
    when count(date_applied) over (partition by rep_id) >=1 then 'status3'
    else 'no_status' end status

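So it takes 10 clients to reach status1, 5 to reach status2, and 1 to reach status3. These are the thresholds for each "status", so if you have 7 users, status2 is still calculated from the date the 5th user applied.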

I think the calculation of time_to_status1/2/3 (which is what I'm trying to get) should look something like this:

case when count(date_applied) over (partition by rep_id) >=10 then
  datediff(day, date_created, date_applied for the 10th user that applied with that rep) end as time_to_status1,
case when count(date_applied) over (partition by rep_id) >=5 then
  datediff(day, date_created, date_applied for the 5th user that applied with that rep) end as time_to_status2,
case when count(date_applied) over (partition by rep_id) >=1 then
  datediff(day, date_created, date_applied for the 1st user that applied with that rep) end as time_to_status3

Any help is much appreciated!

- EDIT -

Sample current data:

rep_id user_id date_created          date_applied          status
1      1       1/1/2018 6:43:22 AM   1/5/2018 2:45:15 PM   status2
1      2       1/1/2018 6:43:22 AM   1/5/2018 3:35:15 PM   status2
1      3       1/1/2018 6:43:22 AM   1/6/2018 4:25:15 PM   status2
1      4       1/1/2018 6:43:22 AM   1/7/2018 5:05:15 PM   status2
1      5       1/1/2018 6:43:22 AM   1/10/2018 3:35:15 PM  status2
1      6       1/1/2018 6:43:22 AM   1/15/2018 12:55:23 PM status2
2      7       1/12/2018 1:13:42 PM  1/15/2018 4:25:15 PM  status3
2      8       1/12/2018 1:13:42 PM  1/16/2018 1:05:15 PM  status3
2      9       1/12/2018 1:13:42 PM  1/16/2018 3:35:15 PM  status3
3      10      1/20/2018 10:13:15 AM 1/26/2018 7:25:15 PM  status3
4      11      1/21/2018 3:33:23 PM  (null)                no_status

Desired output:

rep_id user_id date_created          date_applied          status    time_to_status1 time_to_status2 time_to_status3
1      1       1/1/2018 6:43:22 AM   1/5/2018 2:45:15 PM   status2   (null)          9               (null)
1      2       1/1/2018 6:43:22 AM   1/5/2018 3:35:15 PM   status2   (null)          9               (null)
1      3       1/1/2018 6:43:22 AM   1/6/2018 4:25:15 PM   status2   (null)          9               (null)
1      4       1/1/2018 6:43:22 AM   1/7/2018 5:05:15 PM   status2   (null)          9               (null)
1      5       1/1/2018 6:43:22 AM   1/10/2018 3:35:15 PM  status2   (null)          9               (null)
1      6       1/1/2018 6:43:22 AM   1/15/2018 12:55:23 PM status2   (null)          9               (null)
2      7       1/12/2018 1:13:42 PM  1/15/2018 4:25:15 PM  status3   (null)          (null)          3
2      8       1/12/2018 1:13:42 PM  1/16/2018 1:05:15 PM  status3   (null)          (null)          3
2      9       1/12/2018 1:13:42 PM  1/16/2018 3:35:15 PM  status3   (null)          (null)          3
3      10      1/20/2018 10:13:15 AM 1/26/2018 7:25:15 PM  status3   (null)          (null)          6
4      11      1/21/2018 3:33:23 PM  (null)                no_status (null)          (null)          (null)

rep_id=1 is status2 because he has 6 users with a non-null date_applied, so in his case time_to_status2 is based on the date_applied of the 5th client who signed up with the rep: datediff(day, '1/1/2018 6:43:22 AM', '1/10/2018 3:35:15 PM') = 9

rep_id=2 is status3 because he has 3 users with a non-null date_applied, so in his case time_to_status3 is based on the date_applied of the 1st client who signed up with the rep: datediff(day, '1/12/2018 1:13:42 PM', '1/15/2018 4:25:15 PM') = 3

rep_id=3 is status3 because he has 1 (>= 1) user with a non-null date_applied, so time_to_status3 is simply datediff(day, '1/20/2018 10:13:15 AM', '1/26/2018 7:25:15 PM') = 6
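
As a quick sanity check of the three figures above, the sketch below runs the same datediff calls with explicit timestamp literals. It relies on Redshift's documented behaviour that DATEDIFF counts the date-part boundaries crossed (midnights, for day) rather than elapsed 24-hour periods. The column aliases are made up for illustration and are not part of the original question.

```sql
-- Sanity check for the expected values; the aliases are illustrative only.
-- DATEDIFF(day, ...) counts midnights crossed, so the time of day does not matter.
select
    datediff(day, '2018-01-01 06:43:22'::timestamp,
                  '2018-01-10 15:35:15'::timestamp) as rep1_time_to_status2,  -- 9
    datediff(day, '2018-01-12 13:13:42'::timestamp,
                  '2018-01-15 16:25:15'::timestamp) as rep2_time_to_status3,  -- 3
    datediff(day, '2018-01-20 10:13:15'::timestamp,
                  '2018-01-26 19:25:15'::timestamp) as rep3_time_to_status3;  -- 6
```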

1 Answer:

Answer 0: (score: 0)

Based on a since-deleted hint from @Parfait and on @Gordon's answer to another question, I was able to work out an answer:

with cte as 
(
-- initial query with:
-- (client_signup_date and advisor_onboard_date appear to correspond to
--  date_applied and date_created in the question)
 case when count(client_signup_date) over (partition by rep_id) >=10 then 'status1'
        when count(client_signup_date) over (partition by rep_id) >=5 then 'status2'
        when count(client_signup_date) over (partition by rep_id) >=1 then 'status3'
        else 'none' end status,
      -- number each rep's clients in the order they signed up
      row_number() over(partition by rep_id order by client_signup_date) as rank
)

-- for each rep, take the day difference from the row whose rank equals the
-- threshold of the rep's current status, and spread it over all the rep's rows
select *, 
        max(case when status = 'status1' and rank = 10
                 then datediff(day, advisor_onboard_date, client_signup_date)
            end) over (partition by rep_id) as time_to_status1,
        max(case when status = 'status2' and rank = 5
                 then datediff(day, advisor_onboard_date, client_signup_date)
            end) over (partition by rep_id) as time_to_status2,
        max(case when status = 'status3' and rank = 1
                 then datediff(day, advisor_onboard_date, client_signup_date)
            end) over (partition by rep_id) as time_to_status3
into #t
from cte
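
For readers who want to try the approach against the question's sample data, here is a minimal self-contained sketch. It is not the author's exact query: the sample_data and ranked CTE names and the rn alias are hypothetical, the columns use the question's names (date_created, date_applied) rather than the answer's, and the `into #t` step is left out. It also assumes Redshift's default of sorting NULLs last in ascending order, so rows without a date_applied never take one of the low row numbers.

```sql
-- Sketch only: sample_data stands in for the real source table (hypothetical name).
with sample_data as (
    select 1 as rep_id, 1 as user_id,
           '2018-01-01 06:43:22'::timestamp as date_created,
           '2018-01-05 14:45:15'::timestamp as date_applied
    union all select 1, 2,  '2018-01-01 06:43:22'::timestamp, '2018-01-05 15:35:15'::timestamp
    union all select 1, 3,  '2018-01-01 06:43:22'::timestamp, '2018-01-06 16:25:15'::timestamp
    union all select 1, 4,  '2018-01-01 06:43:22'::timestamp, '2018-01-07 17:05:15'::timestamp
    union all select 1, 5,  '2018-01-01 06:43:22'::timestamp, '2018-01-10 15:35:15'::timestamp
    union all select 1, 6,  '2018-01-01 06:43:22'::timestamp, '2018-01-15 12:55:23'::timestamp
    union all select 2, 7,  '2018-01-12 13:13:42'::timestamp, '2018-01-15 16:25:15'::timestamp
    union all select 2, 8,  '2018-01-12 13:13:42'::timestamp, '2018-01-16 13:05:15'::timestamp
    union all select 2, 9,  '2018-01-12 13:13:42'::timestamp, '2018-01-16 15:35:15'::timestamp
    union all select 3, 10, '2018-01-20 10:13:15'::timestamp, '2018-01-26 19:25:15'::timestamp
    union all select 4, 11, '2018-01-21 15:33:23'::timestamp, null::timestamp
),
ranked as (
    select rep_id, user_id, date_created, date_applied,
           -- count() ignores NULLs, so only users who actually applied are counted
           case when count(date_applied) over (partition by rep_id) >= 10 then 'status1'
                when count(date_applied) over (partition by rep_id) >= 5  then 'status2'
                when count(date_applied) over (partition by rep_id) >= 1  then 'status3'
                else 'no_status' end as status,
           -- position of each user among the rep's applicants, earliest first
           row_number() over (partition by rep_id order by date_applied) as rn
    from sample_data
)
select *,
       -- only the row at the threshold position yields a value; max() over the
       -- partition copies that value onto every row of the same rep
       max(case when status = 'status1' and rn = 10
                then datediff(day, date_created, date_applied)
           end) over (partition by rep_id) as time_to_status1,
       max(case when status = 'status2' and rn = 5
                then datediff(day, date_created, date_applied)
           end) over (partition by rep_id) as time_to_status2,
       max(case when status = 'status3' and rn = 1
                then datediff(day, date_created, date_applied)
           end) over (partition by rep_id) as time_to_status3
from ranked
order by rep_id, user_id;
```

Because each case expression also tests the rep's current status, only the column for that status is populated, which is what the desired output in the question asks for. Dropping the status test would instead fill every threshold the rep has already passed.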