I have a question about retention rate.
I have 2 tables, one with customer data and one with order data.
DISTRIBUTOR as d
+---------+-----------+--------------+--------------------+
| ID      | SETUP_DT  | REINSTATE_DT | LOCAL_REINSTATE_DT |
+---------+-----------+--------------+--------------------+
| C111111 | 2018/1/1  | Null         | Null               |
| C111112 | 2015/12/9 | 2018/10/25   | 2018/10/25         |
| C111113 | 2018/10/1 | Null         | Null               |
| C111114 | 2018/10/6 | 2018/12/14   | 2018/12/14         |
+---------+-----------+--------------+--------------------+
ORDER as o (please note that the data is for reference...)
+---------+----------+-----+
| ID      | ORD_DT   | OAL |
+---------+----------+-----+
| C111111 | 2018/1/1 | 112 |
| C111111 | 2018/1/1 | 100 |
| C111111 | 2018/1/1 | 472 |
| C111111 | 2018/1/1 | 452 |
| C111111 | 2018/1/1 | 248 |
| C111111 | 2018/1/1 | 996 |
+---------+----------+-----+
The 3rd table, which I have in mind for building the retention rate report:
+---------+-----------+-----------+---------------+-----------+
| ID      | APP_MON   | ORDER_MON | TimeDiff(Mon) | TTL AMT   |
+---------+-----------+-----------+---------------+-----------+
| C111111 | 2018/1/1  | 2018/1/1  | -             | 25,443    |
| C111111 | 2018/1/1  | 2018/2/1  | 1             | 7,610     |
| C111111 | 2018/1/1  | 2018/3/1  | 2             | 20,180    |
| C111111 | 2018/1/1  | 2018/4/1  | 3             | 22,265    |
| C111111 | 2018/1/1  | 2018/5/1  | 4             | 34,118    |
| C111111 | 2018/1/1  | 2018/6/1  | 5             | 19,523    |
| C111111 | 2018/1/1  | 2018/7/1  | 6             | 20,220    |
| C111111 | 2018/1/1  | 2018/8/1  | 7             | 2,006     |
| C111111 | 2018/1/1  | 2018/9/1  | 8             | 15,813    |
| C111111 | 2018/1/1  | 2018/10/1 | 9             | 16,733    |
| C111111 | 2018/1/1  | 2018/11/1 | 10            | 20,973    |
| C111112 | 2018/10/1 | 2017/11/1 | -             | 516       |
| C111112 | 2018/10/1 | 2018/10/1 | -             | 1         |
| C111113 | 2018/10/1 | Null      | -             | Null      |
| C111114 | 2018/12/1 | Null      | -             | Null      |
+---------+-----------+-----------+---------------+-----------+

Definition:
- APP_MON: the month that the customer joined, which is the first day of the month of the latest date among [d.SETUP_DT], [d.REINSTATE_DT] and [d.LOCAL_REINSTATE_DT]
- ORD_MON: the month that the customer purchased, which is the first day of the month of the order date
- TimeDiff: the duration in months between APP_MON and ORD_MON, e.g. if A's ORD_MON is 2018/1/1 and A's APP_MON is 2018/2/1, the duration is 1.
- TTL_AMT: the total order amount that the customer bought in the related order month
I am trying to produce the data in the 3rd table, but the code I ran is very slow. I need a more efficient approach, because I have millions of rows. Thanks.
Answer 0 (score: 2)
I don't think you need to use unpivot. To get the latest date, you can use the greatest() function.
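One caveat when applying greatest() to the sample data: in Oracle, greatest() returns NULL if any of its arguments is NULL, and REINSTATE_DT and LOCAL_REINSTATE_DT are Null for customers who were never reinstated. The following is a minimal sketch of one way to guard against that. It assumes Oracle SQL and assumes that SETUP_DT is never NULL, which the question does not state; the inline rows are copied from the DISTRIBUTOR sample and stand in for the real table.

```sql
-- Assumption: Oracle SQL; setup_dt is assumed to be NOT NULL.
-- Inline sample rows stand in for the DISTRIBUTOR table.
with d as
(
  select 'C111111' as id
       , date '2018-01-01' as setup_dt
       , cast(null as date) as reinstate_dt
       , cast(null as date) as local_reinstate_dt
  from dual
  union all
  select 'C111112', date '2015-12-09', date '2018-10-25', date '2018-10-25'
  from dual
)
select d.id
       -- NULL for C111111, because two of the arguments are NULL
     , greatest(d.setup_dt, d.reinstate_dt, d.local_reinstate_dt) as naive_app_dt
       -- NULL reinstate dates fall back to setup_dt before comparing
     , greatest( d.setup_dt
               , nvl(d.reinstate_dt, d.setup_dt)
               , nvl(d.local_reinstate_dt, d.setup_dt) ) as guarded_app_dt
from d
```

For C111111 the unguarded column comes back NULL while the guarded one returns the setup date; for C111112 both return 2018-10-25.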
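This solution has two subqueries: one calculates app_mon for each new customer, and the other calculates the earliest order date for all customers who placed orders in the last two years. This may not be the most efficient approach, but your first priority should be getting the right results. Once you have them, you can tune the query as needed: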
with cust as
(
    -- one row per customer who joined or was reinstated since 2017
    select d.dist_id as id
           -- greatest() returns NULL if any argument is NULL, so the
           -- nullable reinstate dates fall back to setup_dt
         , greatest( d.setup_dt
                   , nvl(d.reinstate_dt, d.setup_dt)
                   , nvl(d.local_reinstate_dt, d.setup_dt) ) as app_mon
    from mjensen_dev.gc_distributor d
    where d.setup_dt >= date '2017-01-01'
       or d.reinstate_dt >= date '2017-01-01'
       or d.local_reinstate_dt >= date '2017-01-01'
) , ord as
(
    -- one row per customer per order month
    select o.dist_id as id
         , min(o.ord_dt) as ord_mon
         , sum(o.oal) as ord_amt
    from gc_orders o
    where o.ord_dt >= date '2017-01-01'
    group by o.dist_id
           , trunc(o.ord_dt,'mm')
)
select cust.id
     , cust.app_mon
     , ord.ord_mon
     , floor(months_between(ord.ord_mon, cust.app_mon)) as mon_diff
     , ord.ord_amt
from cust
     inner join ord on cust.id = ord.id
order by 1, 2
/
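The expected output in the question also lists customers with no orders (Null in the order columns), whereas an inner join returns only customers who have at least one matching order row. If those customers should appear, a left outer join is one option. Below is a minimal sketch, assuming Oracle SQL; the inline rows are illustrative stand-ins for the two subqueries, with values taken from the question's sample tables.

```sql
-- Assumption: Oracle SQL; inline rows stand in for the cust and ord subqueries.
with cust as
(
  select 'C111111' as id, date '2018-01-01' as app_mon from dual
  union all
  select 'C111113', date '2018-10-01' from dual
) , ord as
(
  select 'C111111' as id, date '2018-02-01' as ord_mon, 7610 as ord_amt
  from dual
)
select cust.id
     , cust.app_mon
     , ord.ord_mon                                    -- NULL when there is no order
     , months_between(ord.ord_mon, cust.app_mon) as mon_diff
     , ord.ord_amt
from cust
     left outer join ord on cust.id = ord.id          -- keeps customers without orders
order by 1, 2, 3
```

Here C111113 is returned with NULL in ord_mon, mon_diff and ord_amt, instead of being dropped from the result.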
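You may want to adjust my calculation of mon_diff. It treats 2018/2/1 - 2018/1/1 as a difference of one month, because it seems odd to me that a customer who places an order on the day they join would have a mon_diff of 1 rather than zero. However, if your business rule is stated correctly, you will need to add 1 to the calculation. Also, I have not included trunc() in the date handling, but you may want to put it back.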
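To illustrate the effect of trunc() on the month difference, here is a small sketch, assuming Oracle SQL, with literal dates (taken from the C111114 sample row) standing in for the real columns. Truncating both dates to the first of the month before calling months_between() always yields a whole number of months, while floor() over the raw dates can come out one month lower when the order falls earlier in the month than the joining date.

```sql
-- Assumption: Oracle SQL; literal dates stand in for the joining date and order date.
select trunc(date '2018-10-25', 'mm') as app_mon     -- first day of the joining month
     , trunc(date '2018-12-14', 'mm') as ord_mon     -- first day of the order month
       -- raw dates: fractional months, rounded down
     , floor(months_between( date '2018-12-14'
                           , date '2018-10-25' )) as mon_diff_raw
       -- month-truncated dates: whole months between calendar months
     , months_between( trunc(date '2018-12-14', 'mm')
                     , trunc(date '2018-10-25', 'mm') ) as mon_diff_trunc
       -- alternative rule that counts the joining month as month 1
     , months_between( trunc(date '2018-12-14', 'mm')
                     , trunc(date '2018-10-25', 'mm') ) + 1 as mon_diff_plus_one
from dual
```

With these dates mon_diff_raw is 1, mon_diff_trunc is 2 and mon_diff_plus_one is 3, so which expression to use depends on how the business rule counts the joining month.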