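Sum of the highest consecutive occurrence

Time: 2017-09-08 20:50:11

Tags: mysql sql analytics

I have a table with three columns (lending_id int, installment_n serial int, status text), and I would like to know how to retrieve, for each lending_id, the biggest gap of WAITING_PAYMENT (status), that is, the longest run of consecutive installments with that status.

For the following example: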

lending_id | installment_n | status
71737   1    PAID
71737   2    PAID
71737   3    PAID
71737   4    PAID
71737   5    PAID
71737   6    WAITING_PAYMENT
71737   7    WAITING_PAYMENT
71737   8    WAITING_PAYMENT
71737   9    WAITING_PAYMENT
71737   10   WAITING_PAYMENT
71737   11   WAITING_PAYMENT
71737   12   WAITING_PAYMENT
71737   13   WAITING_PAYMENT
71737   14   WAITING_PAYMENT
71737   15   WAITING_PAYMENT
71737   16   WAITING_PAYMENT
71737   17   WAITING_PAYMENT
71737   18   WAITING_PAYMENT
71737   19   WAITING_PAYMENT
71737   20   WAITING_PAYMENT
71737   21   WAITING_PAYMENT
354226  1    PAID
354226  2    PAID
354226  3    WAITING_PAYMENT
354226  4    WAITING_PAYMENT
354226  5    WAITING_PAYMENT
354226  6    WAITING_PAYMENT
354226  7    PAID
354226  8    WAITING_PAYMENT
354226  9    WAITING_PAYMENT
354226  10   WAITING_PAYMENT
354226  11   WAITING_PAYMENT
354226  12   WAITING_PAYMENT
354226  13   WAITING_PAYMENT
354226  14   WAITING_PAYMENT
354226  15   WAITING_PAYMENT

I would like to know how to retrieve:

lending_id | count
71737      | 16
354226     | 8

For 71737, it would consider installments 6 through 21 (16 installments); for 354226, the gap between installments 8 and 15 (8 installments).

3 answers:

Answer 0: (score: 2)

You can use a correlated subquery and some additional logic:

select lending_id, max(cnt)
from (select lending_id, t.next_in, count(*) as cnt
      from (select t.*,
                   (select min(t2.installment_n)
                    from t t2
                    where t2.lending_id = t.lending_id and t2.installment_n > t.installment_n and
                          t2.status <> 'WAITING_PAYMENT'
                   ) as next_in
            from t 
            where t.status = 'WAITING_PAYMENT'
           ) t
      group by lending_id, t.next_in
     ) lt
group by lending_id;

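How does this work? For each WAITING_PAYMENT row, the innermost subquery gets the next installment number whose status is not WAITING_PAYMENT, or NULL if there is none. Every row in the same unbroken run shares the same next_in value, so this identifies all groups of sequential WAITING_PAYMENT records.

The middle subquery counts the rows in each group. The outer query takes the maximum count per lending_id.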
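The three query levels can be mirrored in a few lines of Python to see which groups next_in produces. This is only a rough sketch of the logic, not the query itself; the helper `next_in` and the variables `group_sizes` and `longest` are names made up for the illustration, and the rows are the sample data from the question.

```python
from collections import Counter

# Sample rows from the question: (lending_id, installment_n, status)
rows = (
    [(71737, n, "PAID") for n in range(1, 6)]
    + [(71737, n, "WAITING_PAYMENT") for n in range(6, 22)]
    + [(354226, n, "PAID") for n in (1, 2, 7)]
    + [(354226, n, "WAITING_PAYMENT") for n in (3, 4, 5, 6)]
    + [(354226, n, "WAITING_PAYMENT") for n in range(8, 16)]
)


def next_in(lending_id, installment_n):
    """Mirror of the correlated subquery (hypothetical helper):
    the next installment that is not WAITING_PAYMENT, or None."""
    later = [
        n
        for (lid, n, status) in rows
        if lid == lending_id and n > installment_n and status != "WAITING_PAYMENT"
    ]
    return min(later) if later else None


# Middle query: count WAITING_PAYMENT rows per (lending_id, next_in) group.
group_sizes = Counter(
    (lid, next_in(lid, n))
    for (lid, n, status) in rows
    if status == "WAITING_PAYMENT"
)

# Outer query: keep the largest group per lending_id.
longest = {}
for (lid, _), cnt in group_sizes.items():
    longest[lid] = max(longest.get(lid, 0), cnt)

print(dict(group_sizes))
print(longest)
```

For 354226 the run 3 to 6 ends up in the group with next_in = 7, while the run 8 to 15 has no later non-waiting installment and lands in the NULL (here None) group.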

Answer 1: (score: 1)

The SQL below should do the trick, in an easy-to-read and easy-to-understand fashion:

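-- Counts the WAITING_PAYMENT installments that come after the last PAID one.
-- "t" stands for your table name ("table" itself is a reserved word).
select t1.lending_id, max(t1.installment_n) - min(t1.installment_n) + 1 as count
from t t1
where t1.status = 'WAITING_PAYMENT'
and t1.installment_n >
  -- coalesce covers a lending_id that has no PAID installment yet
  (select coalesce(max(t2.installment_n), 0) from t t2 where t2.lending_id = t1.lending_id and t2.status = 'PAID')
group by t1.lending_id;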

If you need any further clarification, please don't hesitate to ask.

Ted.
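One point that may be worth checking before relying on the query above: it only looks at WAITING_PAYMENT installments after the last PAID one, which happens to be the longest run in the sample data but need not be in general. The Python sketch below compares the two notions. Both function names are invented for the illustration, the second status list is a hypothetical case that does not appear in the question, and the sketch assumes installments are numbered 1, 2, 3, ... and that PAID and WAITING_PAYMENT are the only statuses.

```python
from itertools import groupby


def waiting_after_last_paid(statuses):
    """Installments after the last PAID one, counted inclusively
    (assumes only PAID and WAITING_PAYMENT occur)."""
    numbered = list(enumerate(statuses, start=1))
    last_paid = max((n for n, s in numbered if s == "PAID"), default=0)
    waiting = [n for n, s in numbered if s == "WAITING_PAYMENT" and n > last_paid]
    return max(waiting) - min(waiting) + 1 if waiting else 0


def longest_waiting_run(statuses):
    """Length of the longest unbroken run of WAITING_PAYMENT."""
    return max(
        (len(list(run)) for status, run in groupby(statuses) if status == "WAITING_PAYMENT"),
        default=0,
    )


# Lending 354226 from the question: both notions agree.
sample = ["PAID"] * 2 + ["WAITING_PAYMENT"] * 4 + ["PAID"] + ["WAITING_PAYMENT"] * 8

# Hypothetical lending (not in the question): the longest run is not the last one.
hypothetical = ["WAITING_PAYMENT"] * 5 + ["PAID"] + ["WAITING_PAYMENT"] * 2

print(waiting_after_last_paid(sample), longest_waiting_run(sample))
print(waiting_after_last_paid(hypothetical), longest_waiting_run(hypothetical))
```

If the data can contain an earlier run that is longer than the final one, one of the grouping approaches in the other answers is probably the safer choice.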

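Answer 2: (score: 1)

This is an approach based on mimicking row_number(), which works for MySQL versions that do not support window functions (window functions are planned for inclusion in MySQL v8.x).

The result of this approach reveals more facts about the longest sequence than just the count: it also shows the status and the installments at which the run starts and ends. See the results below for details.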

SQL Fiddle

MySQL 5.6 Schema Setup

CREATE TABLE Table1
    (`lending_id` int, `installment_n` int, `status` varchar(15))
;

INSERT INTO Table1
    (`lending_id`, `installment_n`, `status`)
VALUES
    (71737, 1, 'PAID'),
    (71737, 2, 'PAID'),
    (71737, 3, 'PAID'),
    (71737, 4, 'PAID'),
    (71737, 5, 'PAID'),
    (71737, 6, 'WAITING_PAYMENT'),
    (71737, 7, 'WAITING_PAYMENT'),
    (71737, 8, 'WAITING_PAYMENT'),
    (71737, 9, 'WAITING_PAYMENT'),
    (71737, 10, 'WAITING_PAYMENT'),
    (71737, 11, 'WAITING_PAYMENT'),
    (71737, 12, 'WAITING_PAYMENT'),
    (71737, 13, 'WAITING_PAYMENT'),
    (71737, 14, 'WAITING_PAYMENT'),
    (71737, 15, 'WAITING_PAYMENT'),
    (71737, 16, 'WAITING_PAYMENT'),
    (71737, 17, 'WAITING_PAYMENT'),
    (71737, 18, 'WAITING_PAYMENT'),
    (71737, 19, 'WAITING_PAYMENT'),
    (71737, 20, 'WAITING_PAYMENT'),
    (71737, 21, 'WAITING_PAYMENT'),
    (354226, 1, 'PAID'),
    (354226, 2, 'PAID'),
    (354226, 3, 'WAITING_PAYMENT'),
    (354226, 4, 'WAITING_PAYMENT'),
    (354226, 5, 'WAITING_PAYMENT'),
    (354226, 6, 'WAITING_PAYMENT'),
    (354226, 7, 'PAID'),
    (354226, 8, 'WAITING_PAYMENT'),
    (354226, 9, 'WAITING_PAYMENT'),
    (354226, 10, 'WAITING_PAYMENT'),
    (354226, 11, 'WAITING_PAYMENT'),
    (354226, 12, 'WAITING_PAYMENT'),
    (354226, 13, 'WAITING_PAYMENT'),
    (354226, 14, 'WAITING_PAYMENT'),
    (354226, 15, 'WAITING_PAYMENT')
;

Query 1

select lending_id, status, start_at_inst, end_at_inst, inst_count
from (
      select IF(@prev_value=lending_id, @rn:=@rn+1 , @rn:=1) AS rn
            , lending_id, status, start_at_inst, end_at_inst, inst_count
            , @prev_value := lending_id z
      from (
           select lending_id
                   , status
                   , grpby
                   , min(installment_n) start_at_inst
                   , max(installment_n) end_at_inst
                   , (max(installment_n) + 1) - min(installment_n) inst_count
            from (
                 select
                        IF(@prev_value=concat_ws(',',lending_id,status), @rn:=@rn+1 , @rn:=1) AS rn
                      , t.*
                      , installment_n - @rn grpby
                      , @prev_value := concat_ws(',',lending_id,status) z
                 from Table1 t
                 cross join (
                     select @rn := 0 , @prev_value := ''
                     ) vars
                 order by lending_id, status,installment_n ASC
                 ) d1
            group by lending_id, status, grpby
          ) d2
      cross join (
          select @rn := 0 , @prev_value := ''
          ) vars
      order by lending_id, inst_count DESC
     ) d3
where rn = 1

Results

| lending_id |          status | start_at_inst | end_at_inst | inst_count |
|------------|-----------------|---------------|-------------|------------|
|     354226 | WAITING_PAYMENT |             8 |          15 |          8 |
|      71737 | WAITING_PAYMENT |             6 |          21 |         16 |

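You cannot use row_number() in MySQL until v8.x reaches a production release. For users whose database already supports it, and for future MySQL users, the same approach can be written with row_number() directly, and I believe that is more efficient than the @variable approach.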

select
       lending_id, status, start_at_inst, end_at_inst, inst_count
from (
select 
       lending_id
       , status
       , grpby
       , min(installment_n) start_at_inst
       , max(installment_n) end_at_inst
       , (max(installment_n) + 1) - min(installment_n) inst_count
       , row_number() over(partition by lending_id order by (max(installment_n) + 1) - min(installment_n) DESC) rn
from (
     select
            t.*
          , installment_n - row_number() over(partition by lending_id, status order by installment_n) grpby
     from Table1 t
     ) d1
group by
       lending_id, status, grpby
    ) d2
where rn = 1
;

Results:

 lending_id | status          | start_at_inst | end_at_inst | inst_count
 ---------: | :-------------- | ------------: | ----------: | ---------:
      71737 | WAITING_PAYMENT |             6 |          21 |         16
     354226 | WAITING_PAYMENT |             8 |          15 |          8
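The key step in both queries is the grpby column: within one lending_id and status, installment_n minus the row number stays constant as long as the installments are consecutive and jumps when there is a break. A small Python sketch of that idea, using the WAITING_PAYMENT installments of lending 354226 from the question (the variable names `keyed`, `islands` and `best` are made up for the illustration):

```python
from itertools import groupby

# WAITING_PAYMENT installments of lending 354226, taken from the question.
waiting = [3, 4, 5, 6, 8, 9, 10, 11, 12, 13, 14, 15]

# row_number() over (partition by lending_id, status order by installment_n)
# is the 1-based position within the sorted partition.
keyed = [(n - rn, n) for rn, n in enumerate(sorted(waiting), start=1)]

# Group by the difference, like "group by lending_id, status, grpby".
islands = []
for grpby, members in groupby(keyed, key=lambda pair: pair[0]):
    ns = [n for _, n in members]
    islands.append(
        {
            "grpby": grpby,
            "start_at_inst": ns[0],
            "end_at_inst": ns[-1],
            "inst_count": ns[-1] + 1 - ns[0],
        }
    )

# Equivalent of keeping rn = 1 after ordering by inst_count DESC.
best = max(islands, key=lambda island: island["inst_count"])

print(islands)
print(best)
```

Installments 3 to 6 all get the difference 2 and installments 8 to 15 all get 3, so each unbroken run becomes its own group.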

dbfiddle (mariadb_10.2) here