I have a dataset that looks like this:
ID MED1 MED2 MED3 MED4
1 892 384 454 345
2 802 394 434 233
3 852 384 334 599
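I want to subset the dataset so that only patients with a prescription drug in the code list {892,334,599,384} remain. I don't want to repeat the list of 4 codes for each of the 4 variables in the data step. Can anyone show me how to do this? Thanks.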
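Answer 0 (score: 2)
I may be lazy and/or misreading the question, but I would just put the codes in a macro variable and be done with it. Like this: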
%let medcodes=892,334,599,384;
data want;
  set have;
  where MED1 in (&medcodes.)
     or MED2 in (&medcodes.)
     or MED3 in (&medcodes.)
     or MED4 in (&medcodes.);
run;
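It's not pretty, it's nothing fancy, and I might not use it in a complicated production system, but for a one-off task it gets the job done without overcomplicating things.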
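If the four repeated conditions are still more repetition than you want, the macro variable can be combined with an array loop so the list is referenced only once. This is a minimal sketch, not part of the original answer; it assumes the same `have` dataset and `medcodes` macro variable, and the helper variables `i` and `found` are made up for illustration. Note that it uses a subsetting IF rather than WHERE, because WHERE cannot reference arrays.

```sas
%let medcodes=892,334,599,384;

data want;
  set have;
  array med [*] MED1-MED4;
  drop i found;            /* hypothetical helper variables */
  found = 0;
  /* Stop scanning as soon as one MED variable matches the list */
  do i = 1 to dim(med) until (found);
    found = (med[i] in (&medcodes.));
  end;
  if found;                /* keep only rows with at least one match */
run;
```

With the sample data from the question, this keeps IDs 1 and 3.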
Answer 1 (score: 1)
If there is only one record per patient, you can use arrays:
data patientMedsData;
  infile datalines delimiter=' ';
  input ID MED1 MED2 MED3 MED4;
  datalines;
1 892 384 454 345
2 802 394 434 233
3 852 384 334 599
;
run;
data result;
  set patientMedsData;
  array med [*] med1-med4;
  array prescribedMeds [4] _temporary_ (892, 334, 599, 384);
  retain keepPatient;
  drop keepPatient;
  * Reset the flag for each new record;
  keepPatient = 0;
  * Look for the prescribed meds;
  drop i j;
  do i = 1 to dim(med);
    do j = 1 to dim(prescribedMeds);
      if med[i] eq prescribedMeds[j] then do;
        keepPatient = 1;
        * Leave the loops as the med is found;
        i = dim(med) + 1;
        leave;
      end;
    end;
  end;
  * Filter patients without the prescribed meds;
  if keepPatient ne 1 then do;
    delete;
  end;
run;
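The inner loop over prescribedMeds can probably be dropped, since the IN operator accepts a numeric array name on its right-hand side. The following is a sketch of that variant, not the answerer's code; it assumes the same patientMedsData dataset, and `result2` is a hypothetical output name.

```sas
data result2;  /* hypothetical output name */
  set patientMedsData;
  array med [*] med1-med4;
  array prescribedMeds [4] _temporary_ (892, 334, 599, 384);
  drop i keepPatient;
  keepPatient = 0;
  /* Scan the MED variables until one of them is found in the list */
  do i = 1 to dim(med) while (not keepPatient);
    /* IN with an array name searches every element of that array */
    keepPatient = (med[i] in prescribedMeds);
  end;
  if keepPatient;  /* subsetting IF: drop patients with no match */
run;
```

The WHILE condition ends the loop after the first match, which replaces the manual `i = dim(med) + 1; leave;` exit used above.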
Answer 2 (score: 0)
Medical codes often have a finite domain. For the case of 3-digit codes, you can use direct-address flagging:
data have;
  input ID MED1 MED2 MED3 MED4;
  datalines;
1 892 384 454 345
2 802 394 434 233
3 852 384 334 599
;
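data want;
  set have;
  * One flag per possible code; assumes every MED value is a non-missing integer in 0 to 999;
  array flags (0:999) _temporary_;
  array vars med1-med4;
  * Clear the target flags, because temporary arrays are retained across rows;
  do _n_ = 892, 334, 599, 384; flags(_n_) = 0; end;
  * Flag every code that appears in this row;
  do _n_ = 1 to dim(vars); flags(vars(_n_)) = 1; end;
  * Count how many target codes were flagged;
  do _n_ = 892, 334, 599, 384; flag_count=sum(flag_count, flags(_n_)); end;
  if flag_count > 0; * adjust pass through threshold as needed;
run;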
A more general version lists the targets in an array instead of in the DO loop index list:
data want;
  set have;
  array targets(4) _temporary_ (892, 334, 599, 384);
  array flags (0:999) _temporary_;
  array vars med1-med4;
  do _n_ = 1 to dim(targets); flags(targets(_n_)) = 0; end;
  do _n_ = 1 to dim(vars); flags(vars(_n_)) = 1; end;
  do _n_ = 1 to dim(targets); flag_count = sum(flag_count, flags(targets(_n_))); end;
  if flag_count > 0; * adjust pass through threshold as needed;
run;