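I have a SAS dataset with 5000 rows and 150 variables, from a survey of 5000 respondents, and I need to delete every row (respondent) that is missing an observation for any one of the 150 variables. So basically, I only want the respondents who answered all 150 variables.
I'm working in proc sql or base SAS, but I can't come up with a simple way to do this. I tried a conditional query, but some columns are numeric and some are character. I also need to run analyses on the numeric columns, so transposing doesn't seem to be an option. Any help would be appreciated.
Thanks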
Answer 0 (score: 2)
With a data step it is simply:
data want;
set have;
if cmiss(of _all_) = 0;
run;
This handles both character and numeric variables.
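To see the filter at work, here is a minimal sketch on a toy dataset. The dataset contents and the variable names (id, age, name) are made up for illustration and are not from the question.

```sas
/* Hypothetical toy data: two numeric variables and one character variable */
data have;
  infile datalines dsd truncover;
  input id age name :$10.;
  datalines;
1,34,Ann
2,,Bob
3,51,
4,28,Dee
;
run;

data want;
  set have;
  /* cmiss counts missing values among both numeric and character arguments; */
  /* the subsetting IF keeps a row only when that count is zero */
  if cmiss(of _all_) = 0;
run;

proc print data=want;
run;
```

Here id 2 has a missing age and id 3 has a missing name, so only ids 1 and 4 should remain in want.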
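Answer 1 (score: 0)
SAS procs tend to handle missing values by dropping entire rows from the data being analyzed, so this may not be as serious as you think. That is, if you are running a forward-selection logistic regression and add a set of variables, only the rows with no missing values in those columns are processed.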
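As a rough illustration of that behaviour, a sketch along the following lines could be used. The dataset name have and the variables y, x1, x2 and x3 are placeholders, not names from the question.

```sas
/* Placeholder names: have, y, x1-x3 are assumed for illustration only */
proc logistic data=have;
  /* rows with a missing response or a missing value in any listed */
  /* predictor are left out of the model fit */
  model y(event='1') = x1 x2 x3 / selection=forward;
run;
```

The output reports the number of observations read and the number used, and the difference between the two shows how many rows were dropped because of missing values.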
If you want to create a new dataset in which no column has a missing value, you can do something like this:
proc sql;
create table t_nomissing as
select t.*
from t
where col1 is not null and col2 is not null and col3 is not null and . . .
col150 is not null;
quit;
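If you have a list of the column names, I would suggest building the where clause in a tool such as Excel, where you can write a formula and copy it down.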
Answer 2 (score: 0)
Taking Gordon Linoff's Excel idea a step further, using only SAS ...
ods output SQL_Results=appliance;
proc sql number;
select * from sashelp.applianc;
quit;
data appliance_2;
set appliance;
if cmiss(of _all_) = 0;
run;
proc sql;
create table que as
select * from dictionary.columns
where libname = "WORK" and memname = "APPLIANCE";
quit;
proc sql ;
select name, "IS NOT NULL AND"
from dictionary.columns where libname = "WORK" and memname = "APPLIANCE";
quit;
*copy / paste / clean-up ;
proc sql;
create table appliance_3 as
select * from appliance
where
Row IS NOT NULL AND
units_1 IS NOT NULL AND
units_2 IS NOT NULL AND
units_3 IS NOT NULL AND
units_4 IS NOT NULL AND
units_5 IS NOT NULL AND
units_6 IS NOT NULL AND
units_7 IS NOT NULL AND
units_8 IS NOT NULL AND
units_9 IS NOT NULL AND
units_10 IS NOT NULL AND
units_11 IS NOT NULL AND
units_12 IS NOT NULL AND
units_13 IS NOT NULL AND
units_14 IS NOT NULL AND
units_15 IS NOT NULL AND
units_16 IS NOT NULL AND
units_17 IS NOT NULL AND
units_18 IS NOT NULL AND
units_19 IS NOT NULL AND
units_20 IS NOT NULL AND
units_21 IS NOT NULL AND
units_22 IS NOT NULL AND
units_23 IS NOT NULL AND
units_24 IS NOT NULL AND
cycle IS NOT NULL
;quit;
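The copy / paste / clean-up step can probably be avoided by having proc sql write the condition list into a macro variable. This is a sketch, not part of the original answer: the macro variable name where_clause and the output table appliance_4 are made up here, and it assumes the column names are ordinary SAS names (names with blanks or special characters would need extra quoting, for example with nliteral).

```sas
/* Build "col1 is not null and col2 is not null and ..." from the metadata */
proc sql noprint;
  select catx(' ', name, 'is not null')
    into :where_clause separated by ' and '
    from dictionary.columns
    where libname = 'WORK' and memname = 'APPLIANCE';
quit;

/* Check the generated condition in the log before using it */
%put &=where_clause;

proc sql;
  create table appliance_4 as
    select *
    from appliance
    where &where_clause;
quit;
```

With 150 columns the generated text stays well within the length limit of a macro variable, and the result should match the hand-built appliance_3 table.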