Question

I have a table with the following information:

id | amount |   date   | customer_id
 1 |  0.00  | 11/12/17 | 1
 2 | 54.00  | 11/12/17 | 1
 3 | 60.00  | 02/12/18 | 1
 4 |  0.00  | 01/18/17 | 2
 5 | 14.00  | 03/12/17 | 2
 6 | 24.00  | 02/22/18 | 2
 7 |  0.00  | 09/12/16 | 3
 8 | 74.00  | 10/01/17 | 3

I need it to look like this:

ranked_id | id | amount |   date   | customer_id
        1 |  1 |  0.00  | 11/12/17 | 1
        2 |  2 | 54.00  | 11/12/17 | 1            
        3 |  3 | 60.00  | 02/12/18 | 1
        4 |  3 | 60.00  | 02/12/18 | 1
        5 |  3 | 60.00  | 02/12/18 | 1
        6 |  3 | 60.00  | 02/12/18 | 1
        7 |  3 | 60.00  | 02/12/18 | 1
        8 |  4 |  0.00  | 01/18/17 | 2
        9 |  5 | 14.00  | 03/12/17 | 2
       10 |  6 | 24.00  | 02/22/18 | 2
       11 |  6 | 24.00  | 02/22/18 | 2
       12 |  6 | 24.00  | 02/22/18 | 2
       13 |  6 | 24.00  | 02/22/18 | 2
       14 |  6 | 24.00  | 02/22/18 | 2
       15 |  7 |  0.00  | 09/12/16 | 3
       16 |  8 | 74.00  | 10/01/17 | 3
       17 |  8 | 74.00  | 10/01/17 | 3
       18 |  8 | 74.00  | 10/01/17 | 3
       19 |  8 | 74.00  | 10/01/17 | 3
       20 |  8 | 74.00  | 10/01/17 | 3
       21 |  8 | 74.00  | 10/01/17 | 3

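I know there are partition and rank functions (for ranking_id), but I can't figure out how to repeat the last row until each customer has 7 rows.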

Answer 1

In Postgres, you can use generate_series() and a cross join to generate all the rows. Then you can pick the values you want:

select row_number() over (order by c.customer_id, g.i) as ranking_id,
       coalesce(t.id, c.id) as id,
       coalesce(t.amount, c.amount) as amount,
       coalesce(t.date, c.date) as date,
       c.customer_id
from (select distinct on (customer_id) t.*   -- latest row per customer, used as filler
      from t
      order by customer_id, date desc, id desc
     ) c cross join
     generate_series(1, 7) g(i) left join    -- 7 slots per customer
     (select t.*, row_number() over (partition by customer_id order by date, id) as i
      from t
     ) t
     on t.customer_id = c.customer_id and t.i = g.i
order by ranking_id;
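
To experiment with the query above, a throwaway table can be built from the question's sample data. This is only a sketch: the column types and the ISO date literals are assumptions, since the question shows dates as mm/dd/yy and gives no schema. The final select shows just the scaffold that the cross join with generate_series() produces, one row per customer and slot (3 x 7 = 21 rows), before the left join attaches any real data.

```sql
-- Hypothetical setup: the table name t matches the query above;
-- the column types are assumptions, the question does not state them.
create table t (
    id          integer primary key,
    amount      numeric(10, 2),
    date        date,
    customer_id integer
);

insert into t (id, amount, date, customer_id) values
    (1,  0.00, '2017-11-12', 1),
    (2, 54.00, '2017-11-12', 1),
    (3, 60.00, '2018-02-12', 1),
    (4,  0.00, '2017-01-18', 2),
    (5, 14.00, '2017-03-12', 2),
    (6, 24.00, '2018-02-22', 2),
    (7,  0.00, '2016-09-12', 3),
    (8, 74.00, '2017-10-01', 3);

-- Scaffold only: one row per customer and slot 1..7, no data attached yet.
select c.customer_id, g.i
from (select distinct customer_id from t) c
     cross join generate_series(1, 7) as g(i)
order by c.customer_id, g.i;
```

Slots with no matching row in t are the ones that coalesce() later fills from the customer's latest row.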

Answer 2

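As @Gordon Linoff suggested, you can use the generate_series() function cross joined with the distinct customer_ids to generate all the rows you need, as shown in t1 below. Then in t2 (also below), the row_number function generates sequential values that are used, along with customer_id, in the outer join from t1.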

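From there, a case expression and the analytic first_value function pick up the last value for each customer_id whenever there is no original row to join to. I couldn't get the last_value analytic function to work, possibly because PostgreSQL lacks the ignore nulls directive, so I used first_value with a descending sort instead, and I only return the analytic value when there is no other data.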

with t1 as (
select distinct 
       dense_rank() over (order by customer_id, generate_series) ranked_id
     , customer_id
     , generate_series
  from table1
  cross join generate_series(1,7)
), t2 as (
  select row_number() over (partition by customer_id order by id) rn
       , table1.*
    from table1
)
select t1.ranked_id
     , case when t2.customer_id is not null
            then t2.id
            else  first_value(t2.id)
                 over (partition by t1.customer_id
                       order by id desc nulls last)
       end id
     , case when t2.customer_id is not null
            then t2.amount
            else  first_value(t2.amount)
                 over (partition by t1.customer_id
                       order by id desc nulls last)
       end amount
     , case when t2.customer_id is not null
            then t2.date
            else  first_value(t2.date)
                 over (partition by t1.customer_id
                       order by id desc nulls last)
       end date
     , t1.customer_id
  from t1
  left join t2
    on t2.customer_id = t1.customer_id
   and t2.rn = t1.generate_series
 order by ranked_id;

Here is a SQL Fiddle demonstrating the code.
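
As a side note on last_value: one likely reason it appears not to work is the default window frame, which ends at the current row (and its peers), so last_value tends to return the current row's own value. The sketch below is not part of the original answer. It reuses the same t1 and t2 CTEs against table1, sorts the unmatched (null) rows first, and widens the frame to the whole partition. It also swaps the case expressions for coalesce, which assumes id, amount and date are never null in table1.

```sql
-- Sketch under the assumptions above: last_value with an explicit full-partition frame.
with t1 as (
  select distinct
         dense_rank() over (order by customer_id, generate_series) ranked_id
       , customer_id
       , generate_series
    from table1
   cross join generate_series(1,7)
), t2 as (
  select row_number() over (partition by customer_id order by id) rn
       , table1.*
    from table1
)
select t1.ranked_id
     , coalesce(t2.id,     last_value(t2.id)     over w) id
     , coalesce(t2.amount, last_value(t2.amount) over w) amount
     , coalesce(t2.date,   last_value(t2.date)   over w) date
     , t1.customer_id
  from t1
  left join t2
    on t2.customer_id = t1.customer_id
   and t2.rn = t1.generate_series
window w as (partition by t1.customer_id
             order by t2.id nulls first   -- unmatched slots sort before real rows
             rows between unbounded preceding and unbounded following)
 order by ranked_id;
```

With nulls sorted first and the frame covering the whole partition, the last row in each partition is the customer's highest id, which is the row the question wants repeated.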
