DATA have;
infile datalines DELIMITER=',';
INFORMAT id 2. type $1. date1 date2 MMDDYY10. ;
INPUT id date1 type date2;
format date1 date9.
date2 date9.;
DATALINES;
1,02/09/2012,W,02/09/2012
2,05/16/2012,W,05/18/2012
2,06/18/2012,W,06/18/2012
2,06/18/2012,T,. < drop: same date
3,08/08/2011,W,08/08/2012
3,09/13/2011,W,09/13/2012
4,06/08/2016,W,06/12/2016
4,06/11/2016,T,. < drop: between 6/8 and 6/12
5,08/16/2012,W,08/16/2012
5,08/16/2012,W,08/30/2012
5,08/24/2012,T,. < drop: btw 8/16 and 8/22
6,09/05/2012,W,09/06/2012
7,09/05/2012,W,09/05/2012
7,09/07/2012,W,09/08/2012
7,08/03/2011,W,08/03/2011
7,05/01/2012,W,05/09/2012
7,04/30/2012,T,. <keep: as not between
8,03/31/2017,W,04/01/2017
8,03/06/2017,T,.
8,03/06/2017,L,.
8,07/03/2018,T,.
9,02/17/2016,T,. < drop same day
9,02/18/2016,L,. < drop between 2/17 and 2/22 day
9,02/17/2016,W,02/22/2016
;
run;
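There are three types: W, T, and L. Step 1: if a record of any type other than W has a date1 that falls between the date1 and date2 of a W record with the same ID, that record must be dropped, and the W record should get a new type that reflects the dropped type, for example W&T or W&L.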
PROC SORT DATA= have;
BY ID DATE1 TYPE ;
RUN;
proc sql;
  create table comb2 as
  select id, date1, type, date2
    from have as t
   /* keep every W row, plus any non-W row that is not inside a W interval of the same id */
   where type = "W"
      or not exists (select date1
                       from have
                      where id = t.id
                        and t.date1 between date1 and date2
                        and type = "W");
quit;
But this does not append the dropped type: if a T record is dropped, the corresponding W record should show W&T as its new type.
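One possible way to carry the dropped type over to the W record is sketched below. It is only a sketch under a few assumptions: the only non-W types are T and L (as in the sample data), a non-W record that falls inside several W intervals is attached to each of them, and W rows with identical id, date1 and date2 can be collapsed into one. The names comb_new and new_type are made up for illustration.

```sas
proc sql;
  create table comb_new as
  /* W rows: flag the non-W types whose date1 falls inside [date1, date2] */
  select w.id, w.date1, w.date2,
         catx('&', 'W',
              ifc(max(o.type = 'T'), 'T', ' '),
              ifc(max(o.type = 'L'), 'L', ' ')) as new_type length=5
    from have as w
         left join have as o
           on  o.id = w.id
           and o.type ne 'W'
           and o.date1 between w.date1 and w.date2
   where w.type = 'W'
   group by w.id, w.date1, w.date2
  union all
  /* non-W rows that are not covered by any W interval are kept unchanged */
  select t.id, t.date1, t.date2, t.type as new_type length=5
    from have as t
   where t.type ne 'W'
     and not exists (select 1
                       from have as w
                      where w.id = t.id
                        and w.type = 'W'
                        and t.date1 between w.date1 and w.date2)
   order by id, date1;
quit;
```

The single quotes around '&' keep the macro processor from treating &T or &L as macro variable references. With the sample data, ID 9 has both a T and an L record inside its W interval, so this sketch would label that W row W&T&L; if only one appended type is wanted, the flags would need a priority rule.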
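Step 2 depends on a macro variable, set with %let days_btw = 5;. Within each ID, any record that falls within &days_btw days of the last kept record should be dropped, so that the earliest record is kept. The types of the dropped records should be appended to the kept record's type, but a type should not be appended if it is the same as the one already there.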
%Let DAYS_BTW=5;
proc sort data=have; by id date1; run;
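data want;
  drop date2;
  /* one pass of this loop per ID, so lastDate1 and lastType start out missing for each ID */
  do until(last.id);
    set have; by id;
    /* keep the first record of the ID, any record more than &DAYS_BTW days
       after the last kept one, or a record of the same type as the last kept one */
    if missing(lastDate1) or
       intck("day", lastDate1, date1) > &DAYS_BTW or
       lastType = type then do;
      output;
      lastDate1 = date1;
      lastType = type;
    end;
  end;
  drop last: ;
run;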
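This gives the records as needed, but it does not append the dropped types as a new type. How can I create a field new_type that indicates the dropped records (for example W&T)?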
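A rough sketch of one way to build new_type in the second step follows. It keeps the same keep/drop conditions as the data step above, but holds each kept record back until the next kept record (or the end of the ID) so that the types of the dropped records in between can be appended first. The names have_sorted, want_new, curDate, curType, heldType and startNew are assumptions made for this illustration, and the order of the appended types simply follows the sort order.

```sas
%let DAYS_BTW = 5;

proc sort data=have out=have_sorted;
  by id date1;
run;

data want_new (keep=id date1 new_type);
  length new_type $5 heldType $1;
  format date1 date9.;
  /* date1, heldType and new_type describe the kept record currently being held */
  retain date1 heldType new_type;
  set have_sorted (keep=id date1 type
                   rename=(date1=curDate type=curType));
  by id;

  /* same conditions as above: first record of the ID, gap larger than
     &DAYS_BTW days, or same type as the last kept record */
  if first.id then startNew = 1;
  else startNew = (intck('day', date1, curDate) > &DAYS_BTW)
                  or (curType = heldType);

  if startNew then do;
    if not first.id then output;   /* write out the previously held record */
    date1    = curDate;
    heldType = curType;
    new_type = curType;
  end;
  /* dropped record: append its type once, e.g. W becomes W&T */
  else if index(new_type, curType) = 0 then
    new_type = catx('&', new_type, curType);

  if last.id then output;          /* write out the last held record of the ID */
run;
```

The length of $5 leaves room for three one-character types joined by '&'; it would need to grow if more types are added.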