Recursive sum in Redshift

Time: 2014-06-25 00:33:22

Tags: sql postgresql amazon-redshift

CREATE TABLE testit (
  id INT, v1 INT, v2 INT, result INT);

INSERT
   INTO testit (id, v1, v2, result)
   VALUES 
     (1, 1,     2, 1 )
   , (2, 4,     3, 4 )
   , (3, 6,     7, 6 )
   , (4, NULL, 10, 13)
   , (5, NULL, 12, 25)
;

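Given the first three columns (id, v1, v2), I want to write a query that returns the 'result' column:

  • v1 if v1 is not null
  • the (recursive) sum of the previous row's v1 and v2 if v1 is null (or alternatively: the last value of v1 plus the sum of v2 between the previous row and the first row where v1 is null)

Is this possible? SQLFiddle link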

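2 answers:

Answer 0: (score: 1)

The following query returns the desired result. Three separate queries, combined with UNION ALL, cover the following cases:

  • the current row's v1 is not null
  • the current row's v1 is null and the previous row's v1 is not null
  • the current row's v1 is null and the previous row's v1 is also null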

select t_main.id, t_main.v1, t_main.v2, results.result
from 
testit t_main
inner join
(
  -- v1 is not null: the result is v1 itself
  select id, v1 as result
  from testit
  where v1 is not null
  union all
  -- v1 is null and the previous row has a v1: add that row's v1 and v2
  select t1.id, max(t2.v1+t2.v2) sum_result
  from testit t1 
  inner join testit t2 on t2.id = t1.id-1 and t2.v1 is not null
  where t1.v1 is null
  group by t1.id
  union all
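  -- v1 is null and the previous row's v1 is null too:
  -- take v1 + v2 from the latest earlier row that has a v1, plus the current v2
  select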
    to1.id, max(to3.v1+to3.v2+to1.v2)
  from testit to1
  inner join testit to2 on to2.id = to1.id-1 and to2.v1 is null 
  inner join 
  (
    -- t3_id: id of the latest earlier row whose v1 is not null
    select t1.id t1_id, max(t3.id) t3_id
    from testit t1 
    inner join testit t2 on t2.id = t1.id-1 and t2.v1 is null
    inner join testit t3 on t3.id < t1.id and t3.v1 is not null
    where t1.v1 is null
    group by t1.id
  ) max_id on to1.id = max_id.t1_id
  inner join testit to3 on max_id.t3_id = to3.id
  group by to1.id
) results
on t_main.id = results.id
order by t_main.id;

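Performance-wise, this query is probably not the best approach because it uses many self-joins, but there are also many business rules to cover.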

SQL Fiddle
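
On the performance point above, here is a minimal sketch of how the same three cases might be written with window functions and a single self-join. It is not part of the original answer and has not been run on Redshift. The names marked, anchor_id, and prev_id are made up for illustration. It assumes "previous row" means the previous row in id order (the query above uses id - 1 instead). Because of the left join, a row with no earlier non-null v1 comes back with a NULL result instead of being dropped.

```sql
-- Sketch only: hypothetical names (marked, anchor_id, prev_id), untested on Redshift.
with marked as (
  select id, v1, v2,
         -- id of the latest row, up to and including this one, whose v1 is not null
         max(case when v1 is not null then id end)
           over (order by id rows between unbounded preceding and current row) as anchor_id,
         -- id of the previous row in id order (assumption: this is what "previous row" means)
         lag(id) over (order by id) as prev_id
  from testit
)
select m.id, m.v1, m.v2,
       case
         when m.v1 is not null then m.v1                 -- v1 present: result is v1
         when m.prev_id = m.anchor_id then a.v1 + a.v2   -- previous row has a v1
         else a.v1 + a.v2 + m.v2                         -- previous row's v1 is null too
       end as result
from marked m
left join testit a on a.id = m.anchor_id
order by m.id;
```

For the sample rows, walking through this by hand gives 1, 4, 6, 13 (6 + 7) and 25 (6 + 7 + 12), which matches the 'result' column in the question.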

Answer 1: (score: 0)

The SQL expression is:

select ti.*,
       sum(coalesce(v1, v2)) over (order by id)
from testit ti;

I'm not 100% sure that Redshift supports cumulative sums without a range or rows option. So it might be:

select ti.*,
       sum(coalesce(v1, v2)) over (order by id range between unbounded preceding and current row)
from testit ti;

Or:

select ti.*,
       sum(coalesce(v1, v2)) over (order by id rows between unbounded preceding and current row)
from testit ti;

Apologies . . . I don't have access to Redshift right now, and it is sometimes picky about the syntax it accepts for window functions. But one of these three should work.