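I have a dataset in which patients can have multiple (and an unknown number of) values for certain variables, so it ends up looking like this: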
ID Var1 Var2 Var3 Var4
1 Blue Female 17 908
1 Blue Female 17 909
1 Red Female 17 910
1 Red Female 17 911
...
99 Blue Female 14 908
100 Red Male 28 911
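I would like to collapse this data so that each ID has only one entry, with indicators showing whether any of its original rows contained a given value. For example, something like this: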
ID YesBlue Var2 Var3 Yes911
1 1 Female 17 1
99 1 Female 14 0
100 0 Male 28 1
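Is there a straightforward way to do this in SAS? Or, failing that, in Access (where the data comes from), which I don't know how to use?
Answer 0: (score: 3)
If your dataset is named PATIENTS1, perhaps something like this: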
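proc sql noprint;
  /* Flag each row, then take the per-ID maximum of each flag.     */
  /* Selecting * next to summary functions makes PROC SQL remerge  */
  /* the group maxima back onto every original row.                */
  create table patients2 as
  select *
       , case (var1)
           when "Blue" then 1
           else 0
         end as ablue
       , case (var4)
           when 911 then 1
           else 0
         end as a911
       , max(calculated ablue) as yesblue
       , max(calculated a911) as yes911
  from patients1
  group by id
  order by id;
quit;

/* Keep one row per ID and drop the row-level columns. */
proc sort data=patients2 out=patients3(drop=var1 var4 ablue a911) nodupkey;
  by id;
run;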
Answer 1: (score: 2)
Here is a data step solution. I am assuming that the values of Var2 and Var3 are always the same for a given ID.
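data have;
  input ID Var1 $ Var2 $ Var3 Var4;
  cards;
1 Blue Female 17 908
1 Blue Female 17 909
1 Red Female 17 910
1 Red Female 17 911
99 Blue Female 14 908
100 Red Male 28 911
;
run;

/* BY-group processing requires HAVE to be sorted by ID. */
data want (drop=Var1 Var4 _:);
  set have;
  by ID;
  /* Reset the counters at the start of each ID. */
  if first.ID then do;
    _blue=0;
    _911=0;
  end;
  /* Sum statements: the counters are retained across rows. */
  _blue+(Var1='Blue');
  _911+(Var4=911);
  /* Write one row per ID, with the counts turned into 0/1 flags. */
  if last.ID then do;
    YesBlue=(_blue>0);
    Yes911=(_911>0);
    output;
  end;
run;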
Answer 2: (score: 1)
This should do it:
data test;
  input id Var1 $ Var2 $ Var3 Var4;
  datalines;
1 Blue Female 17 908
1 Blue Female 17 909
1 Red Female 17 910
1 Red Female 17 911
99 Blue Female 14 908
100 Red Male 28 911
;
run;

data flatten(drop=Var1 Var4);
  set test;
  by id;
  /* Keep the flags across rows within an ID. */
  retain YesBlue Yes911;
  if first.id then do;
    YesBlue = 0;
    Yes911 = 0;
  end;
  if Var1 eq "Blue" then YesBlue = 1;
  if Var4 eq 911 then Yes911 = 1;
  /* Write one row per ID, on its last record. */
  if last.id then output;
run;
Answer 3: (score: 1)
PROC SQL works well for things like this. This is similar to DavB's answer, but eliminates the extra sort:
data have;
  input ID Var1 $ Var2 $ Var3 Var4;
  cards;
1 Blue Female 17 908
1 Blue Female 17 909
1 Red Female 17 910
1 Red Female 17 911
99 Blue Female 14 908
100 Red Male 28 911
;
run;

proc sql;
  create table want as
  select ID
       , max(case (var1)
               when 'Blue' then 1
               else 0
             end) as YesBlue
       , max(var2) as Var2
       , max(var3) as Var3
       , max(case (var4)
               when 911 then 1
               else 0
             end) as Yes911
  from have
  group by id
  order by id;
quit;
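It also safely reduces your original data to one row per ID, but it runs the risk of incorrect results if the source is not exactly as you describe.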
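The data step answers and this query all rely on Var2 and Var3 being constant within an ID. A minimal sketch of one way to check that before collapsing, assuming the input table is named HAVE as above (the output table name INCONSISTENT_IDS and the columns n_var2 and n_var3 are made up for illustration):

```sas
/* Hypothetical check: list IDs whose Var2 or Var3 takes more than */
/* one distinct value. Assumes the input table is named HAVE.      */
/* Note: count(distinct ...) does not count missing values.        */
proc sql;
  create table inconsistent_ids as
  select ID
       , count(distinct Var2) as n_var2
       , count(distinct Var3) as n_var3
  from have
  group by ID
  having count(distinct Var2) > 1
      or count(distinct Var3) > 1;
quit;
```

If INCONSISTENT_IDS comes back empty, taking max(var2) and max(var3) per ID (or keeping the values from the last row in a data step) should give the same result as any other way of picking them. If it does not, those IDs probably need a closer look before the data is collapsed.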