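Flattening multiple observations in SAS

Time: 2013-01-09 08:22:42

Tags: sas

I have a dataset in which patients can have multiple (and an unknown number of) values for certain variables, so it ends up looking like this: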

    ID   Var1   Var2   Var3   Var4
    1    Blue   Female 17     908
    1    Blue   Female 17     909
    1    Red    Female 17     910
    1    Red    Female 17     911
...
    99   Blue   Female 14     908
    100  Red    Male   28     911

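I would like to collapse this data so that there is only one entry per ID, with indicators of whether a given value was present in any of that ID's original rows. So, for example, something like this: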

ID   YesBlue   Var2      Var3   Yes911
1    1         Female    17     1
99   1         Female    14     0
100  0         Male      28     1

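Is there a straightforward way to do this in SAS? Or, failing that, in Access (where the data comes from), which I don't know how to use.

4 Answers:

Answer 0: (score: 3)

If your dataset is named PATIENTS1, perhaps something like this: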

proc sql noprint;
  create table patients2 as
  select *
        ,case(var1)
           when "Blue" then 1
           else 0
         end as ablue
        ,case(var4)
           when 911 then 1
           else 0
         end as a911
        ,max(calculated ablue) as yesblue
        ,max(calculated a911) as yes911
  from patients1
  group by id
  order by id;
quit;

proc sort data=patients2 out=patients3(drop=var1 var4 ablue a911) nodupkey;
  by id;
run;

Answer 1: (score: 2)

Here is a data step solution. I assume that the values of Var2 and Var3 are always the same for a given ID.

data have;
input ID Var1 $ Var2 $ Var3 Var4;
cards;
1    Blue   Female 17     908
1    Blue   Female 17     909
1    Red    Female 17     910
1    Red    Female 17     911
99   Blue   Female 14     908
100  Red    Male   28     911
;
run;

data want (drop=Var1 Var4 _:);
set have;
by ID;
if first.ID then do;
    _blue=0;
    _911=0;
end;
_blue+(Var1='Blue');
_911+(Var4=911);
if last.ID then do;
    YesBlue=(_blue>0);
    Yes911=(_911>0);
    output;
end;
run;
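
The BY ID statement in the step above relies on HAVE already being sorted (or at least grouped) by ID, as the sample data is. If the real data is not in that order, a sort beforehand would be needed. A minimal sketch, assuming the same dataset name HAVE:

```sas
/* Sort in place by ID so that first.ID / last.ID are set correctly */
/* in the following data step. Assumes the dataset is named HAVE.   */
proc sort data=have;
  by ID;
run;
```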

Answer 2: (score: 1)

Edit: It looks like this is the same as what Keith said, just written differently.

This should do it:

data test;
input id Var1 $ Var2 $ Var3 Var4;
datalines;
1    Blue   Female 17     908
1    Blue   Female 17     909
1    Red    Female 17     910
1    Red    Female 17     911
99   Blue   Female 14     908
100  Red    Male   28     911
;
run;

data flatten(drop=Var1 Var4);
set test;
retain YesBlue;
retain Yes911;
by id;

if first.id then do;
  YesBlue = 0;
  Yes911 = 0;
end;

if Var1 eq "Blue" then YesBlue = 1;
if Var4 eq 911 then Yes911 = 1;

if last.id then output;
run;

Answer 3: (score: 1)

PROC SQL is well suited to this sort of thing. This is similar to DavB's answer, but eliminates the extra sort:

data have;
input ID Var1 $ Var2 $ Var3 Var4;
cards;
1    Blue   Female 17     908
1    Blue   Female 17     909
1    Red    Female 17     910
1    Red    Female 17     911
99   Blue   Female 14     908
100  Red    Male   28     911
;
run;

proc sql;
  create table want as
  select ID
       , max(case(var1)
               when 'Blue'
               then 1
               else 0 end) as YesBlue
       , max(var2)         as Var2
       , max(var3)         as Var3
       , max(case(var4)
               when 911
               then 1
               else 0 end) as Yes911
  from have
  group by id
  order by id;
quit;

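It also safely reduces the original data to one row per ID, but there is a risk of wrong results if the source is not exactly as you describe. For example, if Var2 or Var3 varied within an ID, max() would silently keep only the largest value.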
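
One way to check that assumption before collapsing is to list any ID that carries more than one distinct Var2 or Var3. The query below is only a sketch: it assumes the HAVE dataset from this answer, and the column names n_Var2 and n_Var3 are made up for illustration.

```sas
/* Sketch: report IDs whose Var2 or Var3 takes more than one distinct value. */
/* An empty result suggests max(var2) / max(var3) lose no information.       */
/* Note: count(distinct ...) ignores missing values.                         */
proc sql;
  select ID
       , count(distinct Var2) as n_Var2
       , count(distinct Var3) as n_Var3
  from have
  group by ID
  having count(distinct Var2) > 1
      or count(distinct Var3) > 1;
quit;
```

With the sample data shown here, no ID should be listed, since Var2 and Var3 are constant within each ID.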