3-month moving average - Redshift SQL

Time: 2016-03-20 22:17:46

Tags: sql amazon-redshift domo

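I'm trying to create a 3-month moving average from some data, using either Redshift SQL or Domo BeastMode (if anyone is familiar with it).

The data is daily but needs to be displayed by month. So the quotes/revenue need to be summed by month, and then a 3MMA (excluding the current month) needs to be calculated.

So if a quote falls in April, I need the average of Jan, Feb, and Mar.

The input data looks like this: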

Quote Date MM/DD/YYYY     Revenue
3/24/2015                 61214
8/4/2015                  22983
9/3/2015                  30000
9/15/2015                 171300
9/30/2015                 112000

I need the output to look like this:

Month               Revenue             3MMA
Jan 2015            =Sum of Jan Rev     =(Oct14 + Nov14 + Dec14) / 3
Feb 2015            =Sum of Feb Rev     =(Nov14 + Dec14 + Jan15) / 3
March 2015          =Sum of Mar Rev     =(Dec14 + Jan15 + Feb15) / 3
April 2015          =Sum of Apr Rev     =(Jan15 + Feb15 + Mar15) / 3
May 2015            =Sum of May Rev     =(Feb15 + Mar15 + Apr15) / 3

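If anyone is able to help, I'd be extremely grateful! I've been stuck on this for a long time and have no idea what I'm doing when it comes to SQL lol.

Cheers, Logan.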

3 Answers:

Answer 0: (score: 1)

You can do this using aggregation and window functions:

select date_trunc('month', quotedate) as mon,
       sum(revenue) as mon_revenue,
       avg(sum(revenue)) over (order by date_trunc('month', quotedate)  rows between 2 preceding and current row) as revenue_3mon
from t
group by date_trunc('month', quotedate) 
order by mon;

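Note: this uses avg, so for the first and second rows it divides by 1 and 2 respectively. It also assumes that you have at least one record for every month. As written, the frame ends at the current row, so the current month is included in the average.

EDIT:

I wonder whether Redshift has a problem with mixing aggregation functions and analytic functions. The following may work better: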

select m.*,
       avg(mon_revenue) over (order by mon rows between 2 preceding and current row) as revenue_3mon
from (select date_trunc('month', quotedate) as mon,
             sum(revenue) as mon_revenue
      from t
      group by date_trunc('month', quotedate) 
     ) m
order by mon;
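
The assumption noted above (at least one record for every month) matters for the sample data in the question, which jumps from March to August. As a rough illustration only, the sketch below feeds the question's five rows through the same monthly aggregation and reports how many rows land in each frame and where the frame starts. The inline CTE named t is a stand-in for the real table, and the column names rows_in_frame and frame_start are made up for this example.

```sql
-- Sample rows copied from the question; this CTE stands in for the real table t.
with t as (
    select '2015-03-24'::date as quotedate, 61214 as revenue union all
    select '2015-08-04'::date, 22983 union all
    select '2015-09-03'::date, 30000 union all
    select '2015-09-15'::date, 171300 union all
    select '2015-09-30'::date, 112000
),
m as (
    -- One row per month that actually has data.
    select date_trunc('month', quotedate) as mon,
           sum(revenue) as mon_revenue
    from t
    group by 1
)
select mon,
       mon_revenue,
       -- How many monthly rows the frame contains (hypothetical diagnostic column).
       count(*) over (order by mon rows between 2 preceding and current row) as rows_in_frame,
       -- Earliest month inside the frame (hypothetical diagnostic column).
       min(mon) over (order by mon rows between 2 preceding and current row) as frame_start
from m
order by mon;
```

Only three monthly rows exist (March, August, September), so the September frame reaches back to March: the frame counts rows, not calendar months.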

Answer 1: (score: 0)

You can't use aggregate functions and analytic functions together. The query should be:

select m.*,
       avg(mon_revenue) over (order by mon rows between 3 preceding and 1 preceding) as revenue_3mon -- using 3 preceding and 1 preceding row you exclude the current row
from (select date_trunc('month', quotedate) as mon,
             sum(revenue) as mon_revenue
      from t
      group by date_trunc('month', quotedate) 
     ) m
order by mon;

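The frame is "rows between 3 preceding and 1 preceding" (the trailing "row" should be removed, otherwise Redshift will not work).

Answer 2: (score: 0)

You could take an approach similar to how we create buckets for a rolling 6 weeks (the date column is called "date"):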
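
If some months have no quotes at all, a row-based frame will pick up whichever three rows come before, not the three previous calendar months. One possible workaround, sketched here under assumptions rather than offered as the answerer's method, is to join the monthly totals to themselves on a calendar range. It assumes the same table t(quotedate, revenue) used above, always divides by 3, and therefore treats a missing month as zero revenue, which may or may not be the desired behaviour. The alias revenue_3mma is made up for this example.

```sql
-- Assumes table t(quotedate, revenue), as in the queries above.
with monthly as (
    select date_trunc('month', quotedate) as mon,
           sum(revenue) as mon_revenue
    from t
    group by 1
)
select cur.mon,
       cur.mon_revenue,
       -- Add up whatever exists in the three prior calendar months and divide by 3,
       -- so a prior month without any quotes contributes zero.
       coalesce(sum(prev.mon_revenue), 0) / 3.0 as revenue_3mma
from monthly cur
left join monthly prev
       on prev.mon >= dateadd(month, -3, cur.mon)  -- three months back, inclusive
      and prev.mon <  cur.mon                      -- current month excluded
group by cur.mon, cur.mon_revenue
order by cur.mon;
```

A month with no quotes of its own still produces no output row here; covering those would need a separate list of months to join against.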

case 
    when date_trunc('week',dateadd(day,1,date)) = date_trunc('week',dateadd(day,1,current_date)) then 'CW'
    when date_trunc('week',dateadd(day,1,date)) = date_trunc('week',dateadd(day,-6,current_date)) then 'LW'
    when date_trunc('week',dateadd(day,1,date)) = date_trunc('week',dateadd(day,-13,current_date)) then '2W'
    when date_trunc('week',dateadd(day,1,date)) = date_trunc('week',dateadd(day,-20,current_date)) then '3W'
    when date_trunc('week',dateadd(day,1,date)) = date_trunc('week',dateadd(day,-27,current_date)) then '4W'
    when date_trunc('week',dateadd(day,1,date)) = date_trunc('week',dateadd(day,-34,current_date)) then '5W'
    when date_trunc('week',dateadd(day,1,date)) = date_trunc('week',dateadd(day,-41,current_date)) then '6W'  
end as dateweek

Then you can create the average in a later step of your dataflow...
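
A monthly version of the same bucketing idea might look like the sketch below. It is only an illustration: it assumes the table t(quotedate, revenue) from the other answers, relies on Redshift's datediff counting month boundaries crossed, and produces the 3MMA for the current month only, not one row per month. The output column names are made up.

```sql
-- Assumes table t(quotedate, revenue); names follow the earlier answers.
-- datediff(month, quotedate, current_date) = 1 means "last calendar month", 2 the month before, and so on.
select sum(case when datediff(month, quotedate, current_date) = 1 then revenue else 0 end) as m1_revenue,
       sum(case when datediff(month, quotedate, current_date) = 2 then revenue else 0 end) as m2_revenue,
       sum(case when datediff(month, quotedate, current_date) = 3 then revenue else 0 end) as m3_revenue,
       -- Average of the three prior calendar months; the current month (datediff = 0) is excluded.
       sum(case when datediff(month, quotedate, current_date) between 1 and 3 then revenue else 0 end) / 3.0 as revenue_3mma
from t;
```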