I have a data set with a large number of variables. For example:
ID v1 v2 v3 v4 v5 v6 v7 v8
1 4 1 2 2 2 2 1 2
2 2 3 1 4 3 4 4 2
3 3 5 1 3 4 3 4 3
4 3 1 2 3 2 2 4 2
5 5 1 5 5 3 5 1 5
...
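I want to take the mean of each variable, store it, and then be able to use it in other data sets.
What I have tried so far is the following, repeated for every variable: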
proc means data=data;
var v1;
output out=v1out mean=meanv1;
run;
proc means data=data;
var v2;
output out=v2out mean=meanv2;
run;
...
Then, again for each variable:
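data v1temp;
set v1out; /* read the data set written by PROC MEANS above */
call symput("meanv1",meanv1);
run;
data v2temp;
set v2out; /* read the data set written by PROC MEANS above */
call symput("meanv2",meanv2);
run;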
...
But this is very tedious when there are many variables. Is there an easier way?
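Answer 0 (score: 1)
"I want to take the mean of each variable, store it, and then be able to use it in other data sets."
There seems to be no advantage to using global macro variables for this. An alternative is to compute the means as @user102890 suggested: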
proc means data = myData noprint;
var v1-v8;
output out = myDataMeans(drop = _type_ _freq_
where = (_stat_='MEAN')
rename = (v1-v8 = meanV1-meanV8));
run;
Then set that single observation into your data set:
DATA myData;
set myData;
if _N_ = 1 then set myDataMeans;
...;
RUN;
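The variables meanV1-meanV8 are then available as actual data set values on every observation of myData. You can do the same for any other data set in which you want to use these values.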
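As a minimal sketch of reusing the stored means in a different data set, the step below subtracts each mean from the corresponding variable. It assumes a hypothetical data set otherData that also contains v1-v8; otherCentered and the array names are illustrative, not part of the original answer.

```sas
/* otherData is a hypothetical data set that also contains v1-v8 */
data otherCentered;
    /* read the single row of means once; SET variables are retained */
    if _N_ = 1 then set myDataMeans;
    set otherData;
    array v{8} v1-v8;
    array m{8} meanV1-meanV8;
    do i = 1 to 8;
        v{i} = v{i} - m{i};   /* center each variable on the stored mean */
    end;
    drop i _stat_ meanV1-meanV8;
run;
```

Because the means live in a data set rather than in macro variables, they keep full numeric precision and no macro code is needed.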
Answer 1 (score: 1)
Have a look at the power of PROC SQL ;)
data myData;
input id v1-v8;
datalines;
1 4 1 2 2 2 2 1 2
2 2 3 1 4 3 4 4 2
3 3 5 1 3 4 3 4 3
4 3 1 2 3 2 2 4 2
5 5 1 5 5 3 5 1 5
;
run;
proc transpose data= myData out= myXData;
by id;
var v1-v8;
run;
proc sql noprint;
select mean( col1 )
into :mean1 - :mean8
from myXData
group by _name_
;
quit;
%put &mean1 &mean2 &mean3 &mean4 &mean5 &mean6 &mean7 &mean8;
Log output:
%put &mean1 &mean2 &mean3 &mean4 &mean5 &mean6 &mean7 &mean8;
3.4 2.2 2.2 3.4 2.8 3.2 2.8 2.8
I still think macro variables are not the best way to store a sequence of values.
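One caveat with the query above: the groups typically come back ordered by _name_ as character values, so with ten or more variables v10 would sort before v2, and &mean2 would no longer hold the mean of v2. A sketch that stores each variable name next to its mean avoids relying on position. The macro variable names name1... and mean1... are illustrative, and 999 is only an upper bound; PROC SQL creates just as many variables as there are rows.

```sas
proc sql noprint;
    /* keep the variable name paired with its mean */
    select _name_, mean(col1)
        into :name1 - :name999,
             :mean1 - :mean999
        from myXData
        group by _name_;
quit;

/* show the first name/mean pair in the log */
%put &name1 = &mean1;
```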
Answer 2 (score: 0)
data myData;
input id v1-v8;
datalines;
1 4 1 2 2 2 2 1 2
2 2 3 1 4 3 4 4 2
3 3 5 1 3 4 3 4 3
4 3 1 2 3 2 2 4 2
5 5 1 5 5 3 5 1 5
;
run;
proc means data = myData noprint;
var v1-v8;
output out = myDataMeans(drop = _type_ _freq_
where = (_stat_='MEAN')
rename = (v1-v8 = meanV1-meanV8));
run;
The output data set myDataMeans looks like this:
_STAT_ meanV1 meanV2 meanV3 meanV4 meanV5 meanV6 meanV7 meanV8
MEAN 3.4 2.2 2.2 3.4 2.8 3.2 2.8 2.8
The following reads the myDataMeans data set and puts each of its columns into a macro variable of the same name.
%let dsid=%sysfunc(open(myDataMeans,i));/*open the dataset which has macro vars to read in cols*/
%syscall set(dsid); /*no leading ampersand with %SYSCALL */
%let rc=%sysfunc(fetchobs(&dsid,1));/*just reading 1 obs*/
%let rc=%sysfunc(close(&dsid));/*close dataset after reading*/
%put _user_;
This creates the following global macro variables, as shown in the log:
GLOBAL _STAT_ MEAN
GLOBAL MEANV1 3.4
GLOBAL MEANV2 2.2
GLOBAL MEANV3 2.2
GLOBAL MEANV4 3.4
GLOBAL MEANV5 2.8
GLOBAL MEANV6 3.2
GLOBAL MEANV7 2.8
GLOBAL MEANV8 2.8
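If the %SYSCALL approach feels opaque, a DATA _NULL_ step with CALL SYMPUTX is a possible alternative. This is only a sketch, assuming myDataMeans was created as shown above; the array name m is illustrative. Each macro variable is named after its column through VNAME, and the 'G' argument puts it in the global symbol table.

```sas
data _null_;
    set myDataMeans;
    array m{*} meanV1-meanV8;
    do i = 1 to dim(m);
        /* vname() returns the column name, e.g. meanV1 */
        call symputx(vname(m{i}), m{i}, 'G');
    end;
run;

%put &meanV1 &meanV8;
```

Unlike CALL SYMPUT, CALL SYMPUTX strips leading and trailing blanks and does not write a numeric-to-character conversion note to the log.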