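Hello, I am working on a SAS project and I want to create a dummy variable to account for "preference" in medication. I have a long dataset that records, by time period, whether each person took drug type 1 or type 2. For my study, I want to create a variable that flags individuals who took type 1, then switched to type 2, but went back to type 1. I don't care how long an individual stayed on each drug, only that they follow this pattern.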
id month type
1 1 2
1 2 2
1 3 2
2 1 1
2 2 2
2 3 1
...
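I have many more months, but I just wanted to provide something to illustrate what I am trying to get at. Basically, I want to count the subjects like subject 2.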
Answer 0 (score: 1)
DATA LONG1;
input id month type;
cards;
1 1 2
1 2 2
1 3 2
1 4 2
1 5 2
1 6 2
1 7 2
1 8 2
1 9 2
1 10 2
2 1 1
2 2 1
2 3 1
2 4 1
2 5 1
2 6 1
2 7 1
2 8 1
2 9 1
2 10 1
3 1 1
3 2 1
3 3 1
3 4 2
3 5 1
3 6 1
3 7 1
3 8 1
3 9 1
3 10 1
;
Proc Print; run;
* 1) make a wide dataset by deconstructing the initial long data by month & rejoining by id
2) then use if/then statements to create your dummy variable,
3) then merge the dummy variable back into your long dataset using ID;
DATA month1; set long1; where month=1; rename month=month_1 type=type_1; Proc Sort; by ID; run;
DATA month2; set long1; where month=2; rename month=month_2 type=type_2; Proc Sort; by ID; run;
DATA month3; set long1; where month=3; rename month=month_3 type=type_3; Proc Sort; by ID; run;
DATA month4; set long1; where month=4; rename month=month_4 type=type_4; Proc Sort; by ID; run;
DATA month5; set long1; where month=5; rename month=month_5 type=type_5; Proc Sort; by ID; run;
DATA month6; set long1; where month=6; rename month=month_6 type=type_6; Proc Sort; by ID; run;
DATA month7; set long1; where month=7; rename month=month_7 type=type_7; Proc Sort; by ID; run;
DATA month8; set long1; where month=8; rename month=month_8 type=type_8; Proc Sort; by ID; run;
DATA month9; set long1; where month=9; rename month=month_9 type=type_9; Proc Sort; by ID; run;
DATA month10; set long1; where month=10; rename month=month_10 type=type_10; Proc Sort; by ID; run;
DATA WIDE;
merge month1 month2 month3 month4 month5 month6 month7 month8 month9 month10; by ID;
if (type_1=1 and type_2=1 and type_3=1 and type_4=1 and type_5=1
and type_6=1 and type_7=1 and type_8=1 and type_9=1 and type_10=1) or
(type_1=2 and type_2=2 and type_3=2 and type_4=2 and type_5=2
and type_6=2 and type_7=2 and type_8=2 and type_9=2 and type_10=2)
then switch='no '; else switch='yes '; keep ID switch; run;
DATA LONG2;
merge wide long1; by ID;
Proc Print; run;
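The ten per-month DATA steps above can probably be collapsed with PROC TRANSPOSE and an array. The following is only a sketch under some assumptions: the dataset names `long_demo`, `wide_demo` and `switch_demo` are hypothetical, the demo data has three months instead of ten, and every id is assumed to have a row for every month. Like the code above, it flags any subject whose type ever changes, not specifically the 1 → 2 → 1 pattern.

```sas
/* Hypothetical demo data in the same layout as LONG1 (3 months only) */
data long_demo;
  input id month type;
  datalines;
1 1 2
1 2 2
1 3 2
2 1 1
2 2 2
2 3 1
;
run;

proc sort data=long_demo; by id month; run;

/* One row per id, with columns type_1, type_2, ... built from MONTH */
proc transpose data=long_demo out=wide_demo(drop=_name_) prefix=type_;
  by id;
  id month;
  var type;
run;

data switch_demo;
  set wide_demo;
  array t{*} type_:;            /* every type_<month> column */
  length switch $3;
  switch = 'no';
  do i = 2 to dim(t);
    /* any month that differs from the first month counts as a switch */
    if t{i} ne t{1} then switch = 'yes';
  end;
  keep id switch;
run;

proc print data=switch_demo; run;
```

On this demo data, id 1 gets switch='no' and id 2 gets switch='yes'. The result can be merged back onto the long data by id, as in the LONG2 step above.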
By the way: also try the SAS listserv, they love this kind of thing:
http://www.listserv.uga.edu/archives/sas-l.html
Answer 1 (score: 0)
This works on the limited data I used:
DATA Have;
input id month type;
datalines;
1 1 1
1 2 1
1 3 1
1 4 1
1 5 1
2 1 1
2 2 2
2 3 1
2 4 1
2 5 1
3 1 1
3 2 1
3 3 2
3 4 2
3 5 1
4 1 2
4 2 2
4 3 2
4 4 2
4 5 2
;
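Data Temp(keep=id dummy);
length dummy $15;
retain Start Type2 dummy;   /* carry state across the rows of one id */
set Have;
by id;
if first.id then Do;        /* reset the state at the start of each id */
Start=0;
Type2=0;
Dummy="";
end;
If Type=1 then do;
If Start=0 then Start=1;    /* first time type 1 is seen */
else if Start=1 and Type2=1 then Dummy="Switch-er-Roo";   /* back on type 1 after type 2 */
end;
else do;
if Start=1 then Type2=1;    /* type 2 seen after type 1 */
end;
if last.id then output;     /* one row per id */
run;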
Data Want;
merge temp(in=a) have(in=b);
by id;
run;
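Answer 2 (score: 0)
I prefer @CarolinaJay65's approach: it is cleaner and involves only one pass over the data. If you are interested in patients who start and end on Type1 but used Type2 at some point, the code can be simplified a little. The following code (using @CarolinaJay65's source data) outputs only the patient ids that match this criterion.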
data switch_id (keep=id);
set have;
by id month;
retain switch;
if first.id then do;
call missing(switch);
if type=1 then switch=0;
end;
else if not missing(switch) and type=2 then switch=1;
if last.id and type=1 and switch=1 then output;
run;
If you only want the number of patients who meet the criterion, you can adapt this code further.
data switch (keep=count);
set have end=final;
by id month;
retain switch count 0;
if first.id then do;
call missing(switch);
if type=1 then switch=0;
end;
else if not missing(switch) and type=2 then switch=1;
if last.id and type=1 and switch=1 then count+1;
if final then output;
run;
Answer 3 (score: 0)
I think the following should work:
DATA Have;
input id month type;
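prev_id = lag(id);   /* call LAG and DIF on every row so their queues stay in step */
diftype = dif(type);
if id ^= prev_id then diftype = .;   /* no difference across an id boundary */
drop prev_id;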
datalines;
1 1 1
1 2 1
1 3 1
1 4 1
1 5 1
2 1 1
2 2 2
2 3 1
2 4 1
2 5 1
3 1 1
3 2 1
3 3 2
3 4 2
3 5 1
4 1 2
4 2 2
4 3 2
4 4 2
4 5 2
;
proc sql;
select case when max(diftype) = 1 and min(diftype) = -1 then 1 else 0 end as flag, * from have
group by id
;
quit;
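One caveat: the max/min test above does not look at order, so a subject going 2 → 1 → 2 would also have a difference of -1 and a difference of +1 and would be flagged. A possible order-aware variant compares the month of the first 1 → 2 change with the month of the last 2 → 1 change. This is a sketch under assumptions: the names `have_demo`, `flag_demo`, `first_up` and `last_down` are hypothetical, and the demo data is made up to show both orderings.

```sas
/* Hypothetical demo data: id 1 follows 1 -> 2 -> 1, id 2 follows 2 -> 1 -> 2 */
data have_demo;
  input id month type;
  prev_id = lag(id);
  diftype = dif(type);                 /* called on every row */
  if id ne prev_id then diftype = .;   /* no difference across ids */
  drop prev_id;
  datalines;
1 1 1
1 2 2
1 3 1
2 1 2
2 2 1
2 3 2
;
run;

proc sql;
  create table flag_demo as
  select id,
         /* the missing check matters: in SAS a missing value sorts below any number */
         (first_up is not missing and first_up < last_down) as flag
  from (select id,
               min(case when diftype = 1  then month else . end) as first_up,
               max(case when diftype = -1 then month else . end) as last_down
        from have_demo
        group by id);
quit;
```

On this demo data the flag is 1 for id 1 and 0 for id 2. Summing `flag` over `flag_demo` would then give the number of subjects who follow the pattern.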