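I have managed to calculate whether a customer was active in a given monthly period and not active in the following period (churned), using CTEs. So far this has turned out to be fairly straightforward. My snippet for doing this (for anyone else wondering how to do it) is shown below. My dwh.marts.fact_customer_kpi table holds records indicating that a customer has been active, which means he or she has spent some money using the service.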
with monthly_usage as (
select
userid as who_identifier,
datediff(month, '1970-01-01', date) as time_period,
date_part(mon,date) as month,
date_part(yr,date) as year,
CAST(
CAST(date_part(yr,date) AS VARCHAR(4)) +
RIGHT('0' + CAST(date_part(mon,date) AS VARCHAR(2)), 2) +
RIGHT('0' + CAST(1 AS VARCHAR(2)), 2)
AS DATETIME)as day
from dwh.marts.fact_customer_kpi as k
inner join dwh.marts.dim_user as u on u.user_id = k.userid
where
kpi = 'ACTIVE'
and (datediff(month, CURRENT_DATE, registration_date)*-1) > 1
group by 1,2,3,4,5
order by 1,2,3,4,5)
,
lag_lead as (
select who_identifier,
time_period,
year,
month,
day,
lag(time_period,1) over (partition by who_identifier order by time_period) as lag,
lead(time_period,1) over (partition by who_identifier order by time_period) as lead
from monthly_usage)
,
lag_lead_with_diffs as (
select who_identifier,
year,
month,
day,
time_period,
lag,
lead,
time_period-lag lag_size,
lead-time_period lead_size
from lag_lead)
,
calculated as (
select time_period,
year,
month,
day,
case when lag is null then 'NEW ACTIVE'
when lag_size = 1 then 'ACTIVE'
when lag_size > 1 then 'REACTIVATED'
end as this_month_value,
case when (lead_size > 1 OR lead_size IS NULL) then 'CHURN'
else NULL
end as next_month_churn,
who_identifier,
count(who_identifier) as countIdentifier
from lag_lead_with_diffs group by 1,2,3,4,5,6,7)
select time_period,
day,
this_month_value,
who_identifier,
next_month_churn,
sum(countIdentifier) as countIdentifier
from calculated group by 1,2,3,4,5
union
select time_period+1,
dateadd(month,1,day),
'CHURN',
who_identifier,
next_month_churn,
countIdentifier
from calculated where next_month_churn is not null
order by 1;
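However, I am now wondering whether there is an efficient way in Redshift to calculate the periods based on specific dates. For example, I would like to calculate the same values as above over 7-day periods counted from the customer's registration date, rather than by calendar month.
The changes to my query would be needed in monthly_usage. I tried using - interval '7 days' but so far without success, or I am missing something.
Can anyone point out what I am missing (ideally with an example), or what changes are needed?
I am using Amazon Redshift.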
Answer 0 (score: 1)
Are you missing the date_trunc function? Because it feels that way.
You can replace this:
CAST(
CAST(date_part(yr,date) AS VARCHAR(4)) +
RIGHT('0' + CAST(date_part(mon,date) AS VARCHAR(2)), 2) +
RIGHT('0' + CAST(1 AS VARCHAR(2)), 2)
AS DATETIME)as day
with this:
date_trunc('month', date)
I would then parameterize the query from some convenient language, so that other dateparts can be swapped in easily.
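As a rough sketch of what swapping the datepart might look like, here is the first CTE from the question rewritten around date_trunc and bucketed by calendar week instead of month. The table and column names come from the question. The CTE name weekly_usage, the reference Monday '1970-01-05', and the division by 7 are my own assumptions, chosen because date_trunc('week', ...) in Redshift returns the Monday of the week.

```sql
with weekly_usage as (
    select
        userid as who_identifier,
        -- date_trunc('week', ...) yields a Monday, so the number of days since a
        -- known Monday (1970-01-05, assumed reference) is a multiple of 7;
        -- dividing by 7 gives a consecutive week index
        datediff(day, '1970-01-05'::date, date_trunc('week', date)) / 7 as time_period,
        -- first day of the period, replacing the CAST/RIGHT string building
        date_trunc('week', date) as day
    from dwh.marts.fact_customer_kpi as k
    inner join dwh.marts.dim_user as u on u.user_id = k.userid
    where kpi = 'ACTIVE'
      -- same registration filter as in the question, written without the *-1
      and datediff(month, registration_date, current_date) > 1
    group by 1, 2, 3
)
select * from weekly_usage;
```

Because time_period is still a consecutive integer, the lag and lead comparisons in the later CTEs should work unchanged, but those CTEs would have to drop the year and month columns, which this sketch no longer produces.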
I would probably also replace datediff(month, '1970-01-01', date) with EXTRACT
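For the case the question actually asks about, 7-day periods counted from each customer's registration date, one possible approach is to derive the period index from the day difference rather than from a calendar datepart. This is only a sketch under assumptions: the CTE name period_usage is hypothetical, and I am assuming registration_date lives in dwh.marts.dim_user and that date is never earlier than it.

```sql
with period_usage as (
    select
        userid as who_identifier,
        -- whole 7-day blocks elapsed since registration:
        -- days 0-6 -> period 0, days 7-13 -> period 1, and so on
        -- (integer division truncates)
        datediff(day, registration_date, date) / 7 as time_period,
        -- start date of that block, counted forward from the registration date
        dateadd(
            day,
            (7 * (datediff(day, registration_date, date) / 7))::int,
            registration_date
        ) as day
    from dwh.marts.fact_customer_kpi as k
    inner join dwh.marts.dim_user as u on u.user_id = k.userid
    where kpi = 'ACTIVE'
      -- guard for the assumption above: ignore activity dated before registration
      and date >= registration_date
    group by 1, 2, 3
)
select * from period_usage;
```

Here time_period is relative to each customer, so period 3 for one customer covers different calendar dates than period 3 for another. The lag and lead logic partitions by who_identifier and should still detect gaps correctly, but any final aggregation across customers would probably want to group on day, or on a calendar bucket derived from it, rather than on time_period.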