SAS: Calculate a rolling 5-month average of sales, grouped by year

Time: 2019-10-24 08:32:46

Tags: arrays sas analytics

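Below is the code that creates the input data set. For this data I need to calculate the rolling average of sales over the most recent 5 months, grouped by year.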

I am using an array to calculate the rolling average, but I am not sure how to group by year.

Input data:

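data test;
   /* Each pass moves mo_period to the end of its month; the loop's
      one-day increment then lands on the first day of the next month,
      so the loop writes one row per month. */
   do mo_period = '01jan2008'd to '31dec2010'd;
      year = year(mo_period);
      sales = round(ranuni(1234567) * 1000, .01);
      mo_period = intnx('month', mo_period, 0, 'END');
      output;
   end;
   format mo_period monyy7. sales comma10.2;
run;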
mo_period   year    sales
Jan-08  2008    684.94
Feb-08  2008    515.42
Mar-08  2008    894.1
Apr-08  2008    7.43
May-08  2008    129.75
Jun-08  2008    829.85
Jul-08  2008    168.86
Aug-08  2008    55.17
Sep-08  2008    61.55
Oct-08  2008    867.91
Nov-08  2008    669.72
Dec-08  2008    258.08
Jan-09  2008    454.17
Feb-09  2008    861.42
Mar-09  2008    72.81
Apr-09  2008    860.74
May-09  2008    518.59
Jun-09  2008    .
Jul-09  2008    665.9
Aug-09  2008    707.65
Sep-09  2008    133.17
Oct-09  2008    648.15
Nov-09  2008    475.26
Dec-09  2008    260.54
Jan-10  2008    774.96
Feb-11  2008    37.21
Mar-11  2008    537.34
Apr-11  2008    738.63
May-11  2008    927.31
Jun-11  2008    264.14
Jul-11  2008    145.93
Aug-11  2008    999.54
Sep-11  2008    573.12
Oct-11  2008    952.5
Nov-11  2008    194.42
Dec-11  2008    517.72

Below is the code that calculates the 5-month rolling average:

%let roll_num = 5;

data new;
   set test;
   by year;
   array summed[&roll_num] _temporary_;
   if E = &roll_num then E = 1;
   else E + 1;
   summed[E] = sales;
   if _N_ >= &roll_num then do;
      roll_avg = mean(of summed[*]);
   end;
   format roll_sum roll_avg comma10.2;
run;

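The code above does not restart the calculation for each year: the window carries over from one year into the next, so the output is not computed per year group.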

Expected output:

mo_period   year    sales   count   AVG_SALES_R5MON
Jan-08  2008    684.94  1   .
Feb-08  2008    515.42  2   .
Mar-08  2008    894.1   3   .
Apr-08  2008    7.43    4   .
May-08  2008    129.75  5   446.328
Jun-08  2008    829.85  6   475.31
Jul-08  2008    168.86  7   405.998
Aug-08  2008    55.17   8   238.212
Sep-08  2008    61.55   9   249.036
Oct-08  2008    867.91  10  396.668
Nov-08  2008    669.72  11  364.642
Dec-08  2008    258.08  12  382.486
Jan-09  2008    454.17  1   .
Feb-09  2008    861.42  2   .
Mar-09  2008    72.81   3   .
Apr-09  2008    860.74  4   .
May-09  2008    518.59  5   553.546
Jun-09  2008    .   6   578.39
Jul-09  2008    665.9   7   529.51
Aug-09  2008    707.65  8   688.22
Sep-09  2008    133.17  9   506.3275
Oct-09  2008    648.15  10  538.7175
Nov-09  2008    475.26  11  526.026
Dec-09  2008    260.54  12  444.954
Jan-10  2008    774.96  1   .
Feb-11  2008    37.21   1   .
Mar-11  2008    537.34  2   .
Apr-11  2008    738.63  3   .
May-11  2008    927.31  4   .
Jun-11  2008    264.14  5   500.926
Jul-11  2008    145.93  6   522.67
Aug-11  2008    999.54  7   615.11
Sep-11  2008    573.12  8   582.008
Oct-11  2008    952.5   9   587.046
Nov-11  2008    194.42  10  573.102
Dec-11  2008    517.72  11  647.46
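
To make the expected values easier to check, here is a small sketch that recomputes two of them by hand. The variable names `may08` and `jun09` are only illustrative. The first call averages the five months Jan-08 through May-08; the second averages Feb-09 through Jun-09, where the Jun-09 sales value is missing. Because MEAN skips missing arguments, the second result is an average of four values, which is how the table arrives at 578.39 for Jun-09.

```sas
data _null_;
   /* May-08: window is Jan-08 through May-08 */
   may08 = mean(684.94, 515.42, 894.10, 7.43, 129.75);

   /* Jun-09: window is Feb-09 through Jun-09; Jun-09 sales is missing,
      so MEAN divides by 4, not 5 */
   jun09 = mean(861.42, 72.81, 860.74, 518.59, .);

   /* Write both values to the log for comparison with the table */
   put may08= jun09=;
run;
```

The log values should correspond to the May-08 and Jun-09 rows of the AVG_SALES_R5MON column above.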

1 Answer:

Answer 0: (score: 0)

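While preparing the expected output for the question above, I worked out the logic for grouping by year: the 5-month rolling average restarts each year, so the 2009 averages must not include any 2008 data. The code is below.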

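data new;
   set test;
   by year;                               /* input must be sorted by year */
   array summed[&roll_num] _temporary_;   /* circular buffer of the last &roll_num sales */
   retain cnt;

   /* cnt = row number within the current year */
   if first.year then cnt = 1;
   else cnt = cnt + 1;

   /* E cycles 1..&roll_num; the sum statement retains it across rows */
   if E = &roll_num then E = 1;
   else E + 1;
   summed[E] = sales;

   /* After &roll_num rows in a year, every slot holds a value from that year */
   if cnt >= &roll_num then do;
      roll_sum = sum(of summed[*]);
      roll_avg = mean(of summed[*]);
   end;
run;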
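
A slightly more defensive variant of the same idea, offered as a sketch and not as a definitive implementation. It clears the window and restarts the slot index on the first row of each year, so the result does not depend on old values being overwritten before they are read. It assumes the input is the `test` data set from the question, already sorted by `year`. The names `new_by_year`, `window`, `slot`, `count`, and `avg_sales_r5mon` are illustrative and chosen to resemble the expected output columns.

```sas
%let roll_num = 5;

data new_by_year;
   set test;
   by year;                                /* requires test sorted by year */
   array window[&roll_num] _temporary_;    /* temporary arrays are retained automatically */
   retain count slot 0;

   if first.year then do;
      call missing(of window[*]);          /* discard values from the previous year */
      count = 0;
      slot  = 0;
   end;

   count + 1;                              /* row number within the year */
   slot = mod(slot, &roll_num) + 1;        /* cycles 1..&roll_num */
   window[slot] = sales;

   /* MEAN ignores missing values, so a missing month shrinks the divisor */
   if count >= &roll_num then avg_sales_r5mon = mean(of window[*]);

   drop slot;
run;
```

For the first four rows of each year `avg_sales_r5mon` stays missing, because it is not retained and is only assigned once `count` reaches `&roll_num`.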