Percentage of customers who changed payment frequency over time

Time: 2018-08-17 09:55:13

Tags: sql postgresql

I have the following table:

      customer_id   period_type  starts_on    ends_on
       1            Monthly     06/01/2018   07/01/2018
       2           FourWeekly   01/05/2018   01/06/2018
       3           BiAnnually   02/06/2018   02/12/2018
       4           Fortnightly  06/04/2017   06/05/2018
       5           Weekly       18/07/2018   25/07/2018
       6           Quarterly    14/10/2017   14/10/2018
       7           Annually     04/01/2017   04/01/2018
       1           Fortnightly  01/04/2018   01/05/2018
       2           BiAnnually   30/09/2016   30/03/2018
       3           Weekly       01/04/2018   01/06/2018
       4           FourWeekly   06/03/2017   06/04/2018
       5           Monthly      18/06/2018   18/07/2018
       6           Annually     14/10/2016   14/10/2017
       7           Monthly      03/01/2017   04/01/2017

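starts_on and ends_on mark the period that a payment covers. Every customer has payment records going back to 2015 (hundreds of records), and some or most of them have changed their period type (payment frequency) along the way. I am trying to find the percentage of customers who changed their payment frequency, and I would like to see something like this: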

        year    percentage  switched from     switched to
         2015     20%         weekly          monthly
         2015     50%         Monthly         Fortnightly
         2015     30%         FourWeekly      Annually
         2016     20%         weekly          Annually
         2016     50%         Monthly         Fortnightly
         2016     30%         FourWeekly      monthly
         2017     20%         weekly          Annually
         2017     50%         Monthly         Fortnightly
         2017     30%         FourWeekly      monthly
         2018     20%         weekly          monthly
         2018     50%         Monthly         Annually
         2018     30%         FourWeekly      Annually

I have tried several approaches, but none of them actually worked, because I could not capture the transitions between categories.

2 answers:

Answer 0: (score: 1)

SQL Fiddle: http://sqlfiddle.com/#!17/e5ed4/1

create table customers (
    customer_id int
);
insert into customers (customer_id) values (1), (2), (3), (4), (5), (6), (7);

create table subscriptions (
    customer_id int,
    period_type varchar(50),
    starts_on date, 
    ends_on date 
);

insert into subscriptions (customer_id, period_type, starts_on, ends_on) values
(1, 'Monthly', '2018-01-06', '2018-01-07'),
(1, 'Fortnightly', '2018-04-01', '2018-05-01'),
(1, 'Monthly', '2018-05-01', '2018-06-01'),

(2, 'Monthly', '2016-09-30', '2018-03-30'),
(2, 'Fortnightly', '2018-05-01', '2018-06-01'),

(3, 'BiAnnually', '2017-09-30', '2019-03-30'),
(3, 'Weekly', '2018-04-01', '2018-06-01'),

(4, 'FourWeekly', '2017-03-06', '2017-04-06'),
(4, 'Fortnightly', '2017-04-06', '2018-05-06'),

(5, 'Monthly', '2017-06-18', '2018-07-18'),
(5, 'Weekly', '2018-07-18', '2018-07-25'),

(6, 'Annually', '2016-10-14', '2017-10-14'),
(6, 'Quarterly', '2017-10-14', '2018-10-14'),

(7, 'Monthly', '2017-01-03', '2017-01-04'),
(7, 'Annually', '2017-04-01', '2018-04-01');

Solution:

-- Share of all customers that made each switch, by year of the switch
select to_char(count(distinct customer_id)::numeric / (select count(*) from customers) * 100, '999D99%') as percentage,
       to_char(s2_start, 'YYYY') as year,
       s1_period,
       s2_period
from (
    -- pair each subscription (s1) with later subscriptions (s2) of a different type
    select s1.customer_id,
           s1.period_type as s1_period, s1.starts_on as s1_start, s1.ends_on as s1_end,
           s2.period_type as s2_period, s2.starts_on as s2_start, s2.ends_on as s2_end,
           min(s2.starts_on) over (partition by s1.customer_id, s1.period_type, s1.starts_on) as s2_min
    from subscriptions s1
    inner join subscriptions s2
            on s1.customer_id = s2.customer_id
           and s1.period_type != s2.period_type
           and s1.ends_on <= s2.starts_on
) t
where t.s2_start = t.s2_min  -- keep only the earliest such later subscription
group by year, s1_period, s2_period;
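
The query above finds each switch by self-joining the table and keeping the earliest later row of a different type. As a rough alternative sketch, the same pairing can be expressed with the lead() window function, assuming the subscriptions and customers tables defined in this answer. The column aliases switched_from, switched_to, and percentage are illustrative names, not part of the original answer. One difference to be aware of: this version treats any two consecutive rows (ordered by starts_on) as a switch even when their periods overlap, whereas the self-join requires the first period to have ended.

```sql
-- Sketch: pair each subscription with the same customer's next one via lead(),
-- then keep only the pairs where the period type actually changed.
select to_char(t.next_starts_on, 'YYYY') as year,
       t.period_type                     as switched_from,
       t.next_period_type                as switched_to,
       round(count(distinct t.customer_id) * 100.0
             / (select count(*) from customers), 2) as percentage
from (
    select customer_id,
           period_type,
           lead(period_type) over w as next_period_type,
           lead(starts_on)   over w as next_starts_on
    from subscriptions
    window w as (partition by customer_id order by starts_on)
) t
where t.next_period_type is not null          -- last row per customer has no successor
  and t.next_period_type <> t.period_type     -- ignore renewals of the same type
group by 1, 2, 3
order by 1, 2, 3;
```

The percentage here is relative to all customers, as in the original query, so the rows for a given year need not add up to 100%.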

Answer 1: (score: 0)

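I would approach this differently. The following gets the net change over the course of a year, rather than capturing every change. That usually falls somewhere between "good enough" and "what is really needed".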

-- "customer" is the table name assumed by this answer for the question's table
select extract(year from t.yyyymmdd) as yyyy,
       t.period_type, t.next_period_type,
       count(*) as num_customers,
       count(*) / sum(count(*)) over () as ratio
from (select gs.yyyymmdd, t.period_type,
             -- period type in effect for the same customer at the next yearly snapshot
             lead(t.period_type) over (partition by t.customer_id order by gs.yyyymmdd) as next_period_type
      from generate_series('2016-01-01'::date, '2019-01-01'::date, interval '1 year') gs(yyyymmdd) join
           customer t
           on gs.yyyymmdd >= t.starts_on and gs.yyyymmdd <= t.ends_on
     ) t
group by t.yyyymmdd, t.period_type, t.next_period_type
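
The ratio in the query above is taken over all rows, across every year. If the goal is the layout from the question, where the percentages within each year add up to 100%, one possible variation is to partition the window by the snapshot date and keep only the rows where the type changed. This is only a sketch under assumptions: it reads from the subscriptions table defined in the first answer, the aliases snapshot_date, sub, and s are illustrative, and it presumes at most one subscription per customer is active on each 1 January snapshot.

```sql
-- Sketch: per-year share of each switch among customers who switched,
-- based on yearly snapshots taken on 1 January.
select extract(year from s.snapshot_date) as yyyy,
       s.period_type,
       s.next_period_type,
       count(*) as num_customers,
       count(*) / sum(count(*)) over (partition by s.snapshot_date) as ratio
from (select gs.snapshot_date,
             sub.period_type,
             lead(sub.period_type) over (partition by sub.customer_id
                                         order by gs.snapshot_date) as next_period_type
      from generate_series('2016-01-01'::date, '2019-01-01'::date, interval '1 year') gs(snapshot_date)
      join subscriptions sub
        on gs.snapshot_date >= sub.starts_on
       and gs.snapshot_date <= sub.ends_on
     ) s
where s.next_period_type is not null            -- no later snapshot for this customer
  and s.next_period_type <> s.period_type       -- keep only actual switches
group by s.snapshot_date, s.period_type, s.next_period_type
order by 1, 2, 3;
```

Because only one snapshot per year is compared with the next, a customer who switches and then switches back between two snapshots does not appear at all, which is the "net change" trade-off described in this answer.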