Question

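My SAS dataset has groups, identified by id, and I want to delete every group that has a missing value in some variable.

For example, I have this SAS dataset: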

data have;
    input v1 v2 v3 id;
datalines;
9 7 210 1
0 6 .   1
9 3 320 2
6 1 .   1
9 4 432 2
;
run;

I tried:

/*Order by id*/ 
proc sort data=have;
     by id;
run;

/*Select no missing observations by id*/
data want;
set have;
if cmiss(of _all_) then delete;
run;

However, this code does not exclude the ids that have missing values. It only deletes the rows that contain a missing value.

Answer 1

Hmm. You can use proc sql:

proc sql;
    delete from have
    where exists (select 1
                  from have have2
                  where have.id = have2.id
                    and (have2.v1 is null or have2.v2 is null or have2.v3 is null));
quit;
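As a non-destructive variant of the PROC SQL idea in this answer, the sketch below writes the complete groups to a new table instead of deleting rows from `have`. It assumes the variables to check are `v1`, `v2` and `v3`, as in the question; the output name `want` is only illustrative.

```sas
proc sql;
    /* Keep only the ids that never appear on a row with a missing v1, v2 or v3 */
    create table want as
    select *
    from have
    where id not in
        (select id
         from have
         where cmiss(v1, v2, v3) > 0);
quit;
```

With the sample data from the question, only the two rows with id 2 should remain. Because `have` is left untouched, the step can be rerun safely while checking the result.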

Answer 2

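One idea is to use a double DOW loop. The first loop checks whether the group has any missing values, and the second loop outputs the records of the ids that have none.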

data have;
  input v1 v2 v3 id;
datalines;
9 7 210 1
0 6 .   1
9 3 320 2
6 1 .   1
9 4 432 2
1 2 333 3
;

The data need to be sorted by id first, as in your example.

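data want;
  /* First pass over one id group: count missing values in v1-v3 */
  do until (last.id);
    set have;
    by id;
    anymissing=max(anymissing,cmiss(of v1-v3));
  end;
  /* Second pass over the same group: output its rows only if nothing was missing */
  do until (last.id);
    set have;
    by id;
    if not anymissing then output;
  end;
run;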

Answer 3

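If you just want no rows with missing values in your result dataset, there is no need to delete anything. Simply leave those rows out when you write the result dataset or overwrite the source dataset: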

data have;/*overwriting my have dataset instead of deleting lines*/
set have;
if not cmiss(of _ALL_);
run;

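If you want to drop all rows of a group as soon as one of its rows has a missing value, you can do it like this: when a row has a missing value, store its id, and then do not write any row with that id. This relies on the row with the missing value coming first within its id. Sorting by id alone does not guarantee that order, so arrange the data accordingly before running this step:

data want;
    retain x;                        /* id of the latest group found to have a missing value */
    set have;
    if cmiss(of v1-v3) then x = id;  /* list v1-v3 explicitly: _ALL_ would also count x itself */
    if x ne id;                      /* keep the row only if its id has not been flagged */
    drop x;
run;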
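One way to make the required row order explicit is sketched below. It is a variation on the retained-id idea, not the answer's exact code: each row is flagged, the flagged rows are sorted to the top of their id group, and the first row of each group then decides whether the group is kept. The names `flagged`, `has_missing` and `drop_group` are hypothetical, and the check assumes the variables of interest are `v1`-`v3`.

```sas
/* Flag each row that has a missing value in v1-v3 */
data flagged;
    set have;
    has_missing = (cmiss(of v1-v3) > 0);
run;

/* Within each id, put the flagged rows first */
proc sort data=flagged;
    by id descending has_missing;
run;

/* If the first row of a group is unflagged, the whole group is complete */
data want;
    set flagged;
    by id;
    retain drop_group;
    if first.id then drop_group = has_missing;
    if not drop_group;
    drop has_missing drop_group;
run;
```

With the sample data from the question, id 1 is dropped because its flagged rows sort to the top of the group, and the two rows with id 2 are kept.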
