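Calculating the sum of cumulative products over a window frame

Time: 2017-06-18 18:06:25

Tags: sql vertica

I want to estimate how far my seasonal forecast deviates from the actual data. I have the following dataset: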

day         real_revenue    historical_coeff
01/01/2017  100             1.1
01/02/2017  105             0.98
01/03/2017  109             1.05
01/04/2017  107             1.07
01/05/2017  90              1
01/06/2017  120             0.95
01/07/2017  98              0.99

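On 01/01/2017 revenue = 100. The seasonal forecast takes each day's coefficient and applies it to the current revenue. So it predicts that revenue on 01/02/2017 will be 100*1.1 = 110, on 01/03/2017 it will be 110*0.98 = 107.8, and so on. The forecasted remaining revenue is then the sum over all forecasted days. For example, for 01/01/2017, after applying the daily coefficients, the sum is 688.274235.

The next day, 01/02/2017, we start from the value 105. So we predict that on 01/03/2017 we will have 105*0.98 = 102.9, then for 01/04/2017 we predict 102.9*1.05 = 108.045, and so on. The forecasted total remaining revenue is 531.2557215.

In the end I would like to get a table like this: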

day         forecasted_total_remaining_revenue
01/01/2017  688.274235
01/02/2017  531.2557
01/03/2017  ...
01/04/2017  ...
01/05/2017  ...
01/06/2017  ...
01/07/2017  ...

Basically, for each day I need the sum of cumulative products, i.e. a + a*b + a*b*c + a*b*c*d + ...
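
To make the target numbers concrete, here is a small reference calculation in plain Python (not SQL). It is only a sketch of one reading of the rule: it assumes the last row's coefficient is never applied, because that is what reproduces the 688.274235 quoted for 01/01/2017. The function name `forecasted_remaining` is made up for this illustration.

```python
# Reference calculation for the requested table.
# Assumption: the coefficient on the last row is never applied.
rows = [
    ("01/01/2017", 100, 1.1),
    ("01/02/2017", 105, 0.98),
    ("01/03/2017", 109, 1.05),
    ("01/04/2017", 107, 1.07),
    ("01/05/2017", 90, 1),
    ("01/06/2017", 120, 0.95),
    ("01/07/2017", 98, 0.99),
]

def forecasted_remaining(rows):
    """For each day, chain the coefficients forward and sum the forecasts."""
    last = len(rows) - 1
    totals = []
    for i, (day, revenue, _) in enumerate(rows):
        current, total = revenue, 0.0
        for _, _, coeff in rows[i:last]:
            current *= coeff   # forecast for the following day
            total += current   # accumulates a + a*b + a*b*c + ...
        totals.append((day, total))
    return totals

for day, total in forecasted_remaining(rows):
    print(day, round(total, 6))
```

Under this reading the 01/02/2017 row evaluates to 551.9890425 rather than the 531.2557215 quoted above, so that row may have been computed with a slightly different rule.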

Is it possible to write such a query in Vertica or in SQL?

3 answers:

Answer 0 (score: 1)

You can use ln() and exp() to get the product of the remaining values:

select t.*,
       exp(sum(ln(historical_coeff)) over (order by day desc)) as factor
from t;

Of course, the expression gets more complicated if historical_coeff can be negative or zero.
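
As a rough illustration of why negative or zero values complicate things, here is a Python sketch of the same exp(sum(ln(...))) idea with the sign and the zero case handled separately. The helper name `product_via_logs` is hypothetical. In SQL the same logic would presumably need conditional expressions around ln(abs(...)), but that is an assumption and is not shown here.

```python
import math

def product_via_logs(values):
    """Product computed as exp(sum(ln|v|)), with sign and zero handled apart."""
    if any(v == 0 for v in values):
        return 0.0  # ln(0) is undefined; the product is simply 0
    negatives = sum(1 for v in values if v < 0)
    sign = -1.0 if negatives % 2 else 1.0  # odd count of negatives flips the sign
    return sign * math.exp(sum(math.log(abs(v)) for v in values))

print(product_via_logs([1.1, 0.98, 1.05]))   # all positive
print(product_via_logs([-2.0, 3.0, 0.5]))    # one negative value
print(product_via_logs([2.0, 0.0, 5.0]))     # contains a zero
```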

You can then take the cumulative sum of this to get the overall factor needed for the total:

select t.*,
       real_revenue * sum(factor) over (order by day desc) as forecasted_total_remaining_revenue
from (select t.*,
             exp(sum(ln(historical_coeff)) over (order by day desc)) as factor
      from t
     ) t;
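
For anyone who wants to check the running-product and running-sum idea against the numbers in the question, here is a Python sketch. It is not a translation of the query above: it takes the running product in ascending day order, then a reverse running sum, then divides by the product of the coefficients before the current day. It also assumes the last coefficient is never applied. All variable names are made up for the example.

```python
import math
from itertools import accumulate

revenue = [100, 105, 109, 107, 90, 120, 98]
coeff = [1.1, 0.98, 1.05, 1.07, 1, 0.95, 0.99]

n = len(coeff)
used = coeff[:-1]  # assumption: the last coefficient is never applied

# Running product P_k = c_1 * ... * c_k, computed as exp(sum(ln(c))) by day.
log_sums = list(accumulate(math.log(c) for c in used))
prefix = [math.exp(s) for s in log_sums]

# Reverse running sum of P_k (a cumulative sum taken in descending day order).
suffix = list(accumulate(reversed(prefix)))[::-1]

totals = []
for i in range(n):
    if i == n - 1:
        totals.append(0.0)  # nothing left to forecast on the last day
        continue
    before = prefix[i - 1] if i > 0 else 1.0  # product of coefficients before day i
    totals.append(revenue[i] * suffix[i] / before)

for r, t in zip(revenue, totals):
    print(r, round(t, 6))
```

Dividing by `before` is what restarts the chain at each day; without it, every row would still carry the coefficients of the earlier days.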

Answer 1 (score: 0)

In regular SQL (the syntax shown here is SQL Server), this can be done with a recursive CTE, if the DBMS supports them.

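-- Number the rows by date so each row can be joined to its predecessor.
with rownums as (
    select t.*, row_number() over (order by dt) as rn
    from tbl t
),
-- Recursive CTE: res carries the running product forward one row at a time.
cte as (
    select rn, dt, real_revenue, historical_coeff,
           cast(real_revenue * historical_coeff as decimal(38,10)) as res
    from rownums
    where rn = 1
    union all
    select t.rn, t.dt, t.real_revenue, t.historical_coeff,
           cast(c.res * t.historical_coeff as decimal(38,10))
    from rownums t
    join cte c on t.rn = c.rn + 1
)
-- Sum the running products from each date through the last date.
select dt, sum(res) over (order by dt desc) as forecasted_remaining_revenue
from cte;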

The logic for excluding the last coefficient is unclear. This sums all cumulative products from the given date through the last date.

Sample Demo
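
The recursive CTE above walks forward from the first row. As a point of comparison, here is a Python sketch of a recursion that walks backward instead, using factor[i] = coeff[i] * (1 + factor[i+1]). It assumes the last coefficient is excluded (the point this answer calls unclear), and the function name `remaining_by_recurrence` is invented for the example.

```python
revenue = [100, 105, 109, 107, 90, 120, 98]
coeff = [1.1, 0.98, 1.05, 1.07, 1, 0.95, 0.99]

def remaining_by_recurrence(revenue, coeff):
    """Backward recursion: factor[i] = coeff[i] * (1 + factor[i + 1])."""
    n = len(coeff)
    factor = [0.0] * n  # factor[n-1] stays 0: the last coefficient is unused
    for i in range(n - 2, -1, -1):
        factor[i] = coeff[i] * (1.0 + factor[i + 1])
    # Each day's total is its own revenue times the factor for that day.
    return [r * f for r, f in zip(revenue, factor)]

for total in remaining_by_recurrence(revenue, coeff):
    print(round(total, 6))
```

Because each day's factor depends only on the following day's factor, the whole table comes out of a single pass over the rows.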

Answer 2 (score: 0)

I think you are looking for something like this (you may need to adjust the number of days in the interval):

SELECT
    day,
    SUM ( frev ) OVER ( ORDER BY day
        RANGE BETWEEN CURRENT ROW AND INTERVAL '5 DAYS' FOLLOWING
    ) AS forecasted_total_remaining_revenue
FROM (
    SELECT
        day, 
        real_revenue * 
            EXP( SUM ( LN(historical_coeff)) OVER(
                ORDER BY day
                RANGE BETWEEN CURRENT ROW AND INTERVAL '5 DAYS' FOLLOWING
                )
           ) AS frev
   FROM 
       public.t1
) a 
;