I have a dataset with many variables, many of them character-valued. I use the following code to count the missing values for each variable:
proc format;
value $missfmt ' '='Missing' other='Not Missing';
value missfmt . ='Missing' other='Not Missing';
run;
proc freq data=dataname;
format _CHAR_ $missfmt.; /* apply format for the duration of this PROC */
tables _CHAR_ / missing missprint nocum nopercent;
format _NUMERIC_ missfmt.;
tables _NUMERIC_ / missing missprint nocum nopercent;
run;
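However, this produces an enormous amount of output (300 pages when printed to PDF), and 90% of the variables have no missing values. How can I tell PROC FREQ to show tables only for the variables that have missing values?
Answer 0 (score: 4)
You can identify which variables have missing values from the NLEVELS option in PROC FREQ. My approach is to create a dataset that holds only the variables with missing values, store their names in a macro variable, and then run the PROC FREQ from the question against just those variables. Here is the code to do that.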
/* set up dummy dataset */
data have;
set sashelp.class;
if _n_ in (10,13) then call missing(age,sex);
run;
/* create dataset that holds variables with missing values */
ods select nlevels;
ods output nlevels=miss_vars (where=(nmisslevels>0));
ods noresults;
proc freq data=have nlevels;
run;
ods results;
/* store names in a macro variable */
proc sql noprint;
select tablevar into :missvar separated by ' '
from miss_vars;
quit;
proc format;
value $missfmt ' '='Missing' other='Not Missing';
value missfmt . ='Missing' other='Not Missing';
run;
proc freq data=have (keep=&missvar.);
format _CHAR_ $missfmt.; /* apply format for the duration of this PROC */
tables _CHAR_ / missing missprint nocum nopercent;
format _NUMERIC_ missfmt.;
tables _NUMERIC_ / missing missprint nocum nopercent;
run;
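The code above assumes at least one variable has a missing value. If none does, the query selects no rows and `missvar` is never assigned. A minimal sketch of a guarded version follows. The macro name `freq_missing_only` and the `work._nlev` / `work._missvars` datasets are hypothetical names chosen for illustration, and the `$missfmt.` and `missfmt.` formats are assumed to have been created by the PROC FORMAT step above.

```sas
/* Hypothetical wrapper macro; the macro name and the work._nlev /
   work._missvars datasets are illustrative, not from the original answer.
   Assumes the $missfmt. and missfmt. formats already exist. */
%macro freq_missing_only(data=);
    %local missvar;                 /* local macro variable, starts out empty */

    /* capture the NLevels table in a dataset without printing anything */
    ods exclude all;
    ods output nlevels=work._nlev;
    proc freq data=&data. nlevels;
        tables _all_ / noprint;
    run;
    ods exclude none;

    /* keep variables with at least one missing level; if the NMissLevels
       column is absent, it is created here as missing and no row passes */
    data work._missvars;
        set work._nlev;
        if nmisslevels > 0;
    run;

    /* store the selected variable names in a macro variable */
    proc sql noprint;
        select tablevar into :missvar separated by ' '
        from work._missvars;
    quit;

    /* nothing selected: missvar is still empty, so skip the report */
    %if %length(&missvar.) = 0 %then %do;
        %put NOTE: no variables with missing values found in &data..;
        %return;
    %end;

    /* frequency tables only for the variables that have missing values */
    proc freq data=&data. (keep=&missvar.);
        format _character_ $missfmt. _numeric_ missfmt.;
        tables _all_ / missing nocum nopercent;
    run;
%mend freq_missing_only;

%freq_missing_only(data=have);
```

Filtering in a DATA step rather than with a `where=` dataset option on the ODS output is a deliberate choice in this sketch, so that the step still runs if the `nmisslevels` column is not present.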
Answer 1 (score: 1)
This macro drops every column that is entirely blank:
%macro removeblanks(dataset,output);
/* create dataset that holds variables whose values are all missing */
ods select nlevels;
ods output nlevels=miss_vars (where=(nmisslevels>0 and nnonmisslevels=0));
ods noresults;
proc freq data=&dataset. nlevels;
run;
ods results; /* turn results tracking back on, as in the answer above */
/* store names in a macro variable */
proc sql noprint;
select tablevar into :missvar separated by ' '
from miss_vars;
quit;
data &output.;
set &dataset.(drop=&missvar.);
run;
%mend removeblanks;
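A short usage sketch for the macro, assuming it has already been compiled. The dataset names `have_blank` and `want` are hypothetical, and `weight` is blanked out only to give the macro something to drop. The macro as written expects at least one all-blank column, since otherwise the query assigns nothing to `missvar`.

```sas
/* hypothetical test data: sashelp.class with one column made entirely missing */
data have_blank;
    set sashelp.class;
    call missing(weight);
run;

/* drop the all-blank column(s) and write the result to WANT */
%removeblanks(have_blank, want);

/* list the remaining variables to confirm WEIGHT is gone */
proc contents data=want short;
run;
```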