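Below is the code for my input data set. I need to compute a rolling average of sales over the most recent 5 months, grouped by year.
I am using an array to compute the rolling average, but I am not sure how to group by year.
Input data: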
data test;
    do mo_period = '01jan2008'd to '31dec2010'd;
        year = year(mo_period);
        sales = round(ranuni(1234567) * 1000, .01);
        mo_period = intnx('month', mo_period, 0, 'END');
        output;
    end;
    format mo_period monyy7. sales comma10.2;
run;
mo_period year sales
Jan-08 2008 684.94
Feb-08 2008 515.42
Mar-08 2008 894.1
Apr-08 2008 7.43
May-08 2008 129.75
Jun-08 2008 829.85
Jul-08 2008 168.86
Aug-08 2008 55.17
Sep-08 2008 61.55
Oct-08 2008 867.91
Nov-08 2008 669.72
Dec-08 2008 258.08
Jan-09 2009 454.17
Feb-09 2009 861.42
Mar-09 2009 72.81
Apr-09 2009 860.74
May-09 2009 518.59
Jun-09 2009 .
Jul-09 2009 665.9
Aug-09 2009 707.65
Sep-09 2009 133.17
Oct-09 2009 648.15
Nov-09 2009 475.26
Dec-09 2009 260.54
Jan-10 2010 774.96
Feb-11 2011 37.21
Mar-11 2011 537.34
Apr-11 2011 738.63
May-11 2011 927.31
Jun-11 2011 264.14
Jul-11 2011 145.93
Aug-11 2011 999.54
Sep-11 2011 573.12
Oct-11 2011 952.5
Nov-11 2011 194.42
Dec-11 2011 517.72
Below is the code that computes the 5-month rolling average:
%let roll_num = 5;
data new;
    set test;
    by year;
    array summed[&roll_num] _temporary_;
    if E = &roll_num then E = 1;
    else E + 1;
    summed[E] = sales;
    if _N_ >= &roll_num then do;
        roll_avg = mean(of summed[*]);
    end;
    format roll_sum roll_avg comma10.2;
run;
The code above does not compute the rolling average separately for each year group.
Expected output:
mo_period year sales count AVG_SALES_R5MON
Jan-08 2008 684.94 1 .
Feb-08 2008 515.42 2 .
Mar-08 2008 894.1 3 .
Apr-08 2008 7.43 4 .
May-08 2008 129.75 5 446.328
Jun-08 2008 829.85 6 475.31
Jul-08 2008 168.86 7 405.998
Aug-08 2008 55.17 8 238.212
Sep-08 2008 61.55 9 249.036
Oct-08 2008 867.91 10 396.668
Nov-08 2008 669.72 11 364.642
Dec-08 2008 258.08 12 382.486
Jan-09 2009 454.17 1 .
Feb-09 2009 861.42 2 .
Mar-09 2009 72.81 3 .
Apr-09 2009 860.74 4 .
May-09 2009 518.59 5 553.546
Jun-09 2009 . 6 578.39
Jul-09 2009 665.9 7 529.51
Aug-09 2009 707.65 8 688.22
Sep-09 2009 133.17 9 506.3275
Oct-09 2009 648.15 10 538.7175
Nov-09 2009 475.26 11 526.026
Dec-09 2009 260.54 12 444.954
Jan-10 2010 774.96 1 .
Feb-11 2011 37.21 1 .
Mar-11 2011 537.34 2 .
Apr-11 2011 738.63 3 .
May-11 2011 927.31 4 .
Jun-11 2011 264.14 5 500.926
Jul-11 2011 145.93 6 522.67
Aug-11 2011 999.54 7 615.11
Sep-11 2011 573.12 8 582.008
Oct-11 2011 952.5 9 587.046
Nov-11 2011 194.42 10 573.102
Dec-11 2011 517.72 11 647.46
Answer 0 (score: 0)
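While preparing the expected output for the question above, I worked out the logic for grouping the data by year when computing the 5-month rolling average: the rolling average for 2009 rows must not include any 2008 data. Here is the code: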
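/* Assumes roll_num is defined as in the question (%let roll_num = 5;)
   and that test1 is sorted by year, as BY-group processing requires. */
data new;
    set test1;
    by year;
    array summed[&roll_num] _temporary_;  /* circular buffer of the last &roll_num sales values */
    retain cnt;
    /* cnt is the position of the row within its year */
    if first.year then cnt = 1;
    else cnt = cnt + 1;
    /* E is retained by the sum statement and cycles through 1..&roll_num */
    if E = &roll_num then E = 1;
    else E + 1;
    summed[E] = sales;
    /* Once cnt >= &roll_num, every slot has been overwritten with a value
       from the current year, so earlier years never enter the average. */
    if cnt >= &roll_num then do;
        roll_sum = sum(of summed[*]);
        roll_avg = mean(of summed[*]);  /* MEAN ignores missing sales values */
    end;
run;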
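
As a quick sanity check, two rows of the expected output can be recomputed by hand. This is only a sketch: the data set name `check` and the variables `may08` and `jun09` are made up for illustration, and the sales values are copied from the listing in the question. The second row shows that MEAN skips the missing Jun-09 value, so that window averages four values rather than five.

```sas
data check;
    /* May-08 window: sales for Jan-08 through May-08 from the question's listing */
    may08 = mean(684.94, 515.42, 894.1, 7.43, 129.75);
    /* Jun-09 window: Feb-09 through Jun-09; the Jun-09 sale is missing,
       so MEAN averages only the four non-missing values */
    jun09 = mean(861.42, 72.81, 860.74, 518.59, .);
    put may08= jun09=;
run;
```

The two values written to the log should agree with the 446.328 and 578.39 shown for those rows in the expected output. If a window containing a missing month should instead produce a missing average, one option would be to test `nmiss(of summed[*])` before calling MEAN.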