How do I replace 0 with the group mean in SAS or SQL?

Time: 2017-04-12 08:20:34

Tags: sql sas

I want to replace every 0.0 with the mean value for that particular month.

value   date      month year
33.2    01SEP2016   9   2016
33.7    02SEP2016   9   2016
34.8    03SEP2016   9   2016
33.8    04SEP2016   9   2016
33.7    05SEP2016   9   2016
33.8    06SEP2016   9   2016
32.7    07SEP2016   9   2016
33.4    08SEP2016   9   2016
32.5    09SEP2016   9   2016
33.7    10SEP2016   9   2016
32.7    11SEP2016   9   2016
32.5    12SEP2016   9   2016
32.1    13SEP2016   9   2016
32.2    14SEP2016   9   2016
32.0    15SEP2016   9   2016
31.8    16SEP2016   9   2016
31.8    17SEP2016   9   2016
31.9    18SEP2016   9   2016
32.5    19SEP2016   9   2016
32.5    20SEP2016   9   2016
32.3    21SEP2016   9   2016
32.6    22SEP2016   9   2016
14.2    23SEP2016   9   2016
0.0     24SEP2016   9   2016
0.0     25SEP2016   9   2016
0.0     26SEP2016   9   2016
0.0     27SEP2016   9   2016
0.0     28SEP2016   9   2016
0.0     29SEP2016   9   2016
0.0     30SEP2016   9   2016

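2 answers:

Answer 0 (score: 0)

The first part of the question is straightforward. First change the zero values to missing, then use proc stdize to replace the missing values with the mean for the month. Because the zeros are already missing when the mean is computed, they do not count toward it.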

/* create initial dataset */
data have;
input value   date :date9. month year;
format date date9.;
datalines;
33.2    01SEP2016   9   2016
33.7    02SEP2016   9   2016
34.8    03SEP2016   9   2016
33.8    04SEP2016   9   2016
33.7    05SEP2016   9   2016
33.8    06SEP2016   9   2016
32.7    07SEP2016   9   2016
33.4    08SEP2016   9   2016
32.5    09SEP2016   9   2016
33.7    10SEP2016   9   2016
32.7    11SEP2016   9   2016
32.5    12SEP2016   9   2016
32.1    13SEP2016   9   2016
32.2    14SEP2016   9   2016
32.0    15SEP2016   9   2016
31.8    16SEP2016   9   2016
31.8    17SEP2016   9   2016
31.9    18SEP2016   9   2016
32.5    19SEP2016   9   2016
32.5    20SEP2016   9   2016
32.3    21SEP2016   9   2016
32.6    22SEP2016   9   2016
14.2    23SEP2016   9   2016
0.0     24SEP2016   9   2016
0.0     25SEP2016   9   2016
0.0     26SEP2016   9   2016
0.0     27SEP2016   9   2016
0.0     28SEP2016   9   2016
0.0     29SEP2016   9   2016
0.0     30SEP2016   9   2016
;
run;

/* replace zeros with missing */
data have;
modify have;
call missing(value);
where value=0;
run;

/* replace missing with mean of month */
proc stdize data=have out=want
            method=mean reponly;
by month year;
var value;
run;
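
One caveat, assuming the real data cover more than one month: BY-group processing in SAS expects the input to be sorted (or indexed) by the BY variables. The sample has a single month, so the step above runs as is. A minimal sketch for the general case follows. The dataset name have_missing is hypothetical and is used only so that have is not overwritten.

```sas
/* copy the data, turning zeros into missing values without overwriting have */
data have_missing;
    set have;
    if value = 0 then call missing(value);
run;

/* BY-group processing needs the rows ordered by the BY variables */
proc sort data=have_missing;
    by year month;
run;

/* REPONLY replaces only the missing values, here with each group's mean */
proc stdize data=have_missing out=want method=mean reponly;
    by year month;
    var value;
run;
```

Sorting by year and then month keeps the same month in different years in separate groups.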

Answer 1 (score: 0)

You can use proc sql to generate a new result set:

proc sql;
    /* average only the nonzero values so the zeros do not pull the mean down */
    select (case when t.value = 0 then t2.avg_value else t.value end) as value,
           t.date, t.month, t.year
    from t left join
         (select year, month, avg(value) as avg_value
          from t
          where value <> 0
          group by year, month
         ) t2
         on t.year = t2.year and t.month = t2.month;
quit;

If you want to phrase this as an update, then I would use a correlated subquery:

proc sql;
    /* for each zero row, look up the mean of the nonzero values in the same month */
    update t
        set value = (select avg(t2.value)
                     from t t2
                     where t2.value <> 0 and
                           t2.year = t.year and t2.month = t.month
                    )
        where value = 0;
quit;
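
The filter t2.value <> 0 in the subquery matters. In the sample data, the 23 nonzero September readings sum to 736.4, so their mean is about 32.02. Dividing the same sum by all 30 rows, zeros included, gives about 24.55. A small sketch to compare the two averages side by side, assuming the same table t as above (the column aliases avg_with_zeros and avg_nonzero are made up for illustration):

```sas
proc sql;
    /* avg() skips missing values, so mapping zeros to missing excludes them */
    select year, month,
           avg(value) as avg_with_zeros,
           avg(case when value <> 0 then value else . end) as avg_nonzero
    from t
    group by year, month;
quit;
```

Run this before the update: once the zeros have been replaced, both columns return the same number.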