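SAS: Extract the latest record for each unique combination, sorted by date

Time: 2018-08-11 17:17:45

Tags: sorting sas duplicates

I am having trouble extracting the unique tasks performed by workers from a set of time-stamped events. A unique combination is defined by ID and Mode. The following dataset mimics the scenario: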

  ID        Time       Mode     Event
 23456     20120101    A        Open
 23456     20120101    B        Closed
 87690     20120311    G        Closed
 98000     20120201    B        Open
 98000     20120301    A        Open
 98000     20120101    A        Open
 87889     20121009    C        Closed
 87889     20120101    C        Open
 87900     20120411    A        Closed
 87900     20120102    A        Closed

I would like to get the following result, i.e. the most recent event for each ID/Mode combination:

  ID        Time       Mode     Event
 23456     20120101    A        Open
 23456     20120101    B        Closed
 87690     20120311    G        Closed
 98000     20120201    B        Open
 98000     20120301    A        Open
 87889     20121009    C        Closed
 87900     20120411    A        Closed

I would first sort by time in descending order:

  proc sort data=df; by ID descending time; run;

Then I can sort again to get the unique combinations of ID and Mode:

  proc sort data=df dupout=nodup nodupkey;
     by ID Mode; run;

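In that last step, how can I make sure that the record kept for each combination is also the most recent event?

Thanks!

1 answer:

Answer 0 (score: 1)

You can use the first./last. BY-group concept: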

data have;
  input ID Time :yymmdd8. Mode $ Event $;
  format Time yymmdd10.;
  datalines;
23456     20120101    A        Open
23456     20120101    B        Closed
87690     20120311    G        Closed
98000     20120201    B        Open
98000     20120301    A        Open
98000     20120101    A        Open
87889     20121009    C        Closed
87889     20120101    C        Open
87900     20120411    A        Closed
87900     20120102    A        Closed
;

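/* Sort so that the latest time comes last within each ID/Mode pair. */
proc sort data=have out=have1;
  by id mode time;
run;

/* Keep only the last row of each ID/Mode group, i.e. the latest event. */
data want;
  set have1;
  by id mode time;
  if last.mode;   /* last.mode = 1 already implies last.time = 1 */
run;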
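
If it is unclear why the last.mode test picks the latest row, it can help to look at the BY-group flags directly. The following is only a debugging sketch: it assumes the sorted dataset have1 from above, and the variable names f_mode and l_mode are made up for illustration.

```sas
/* Debugging sketch: write the BY-group flags to the log.          */
/* f_mode and l_mode are hypothetical names, not from the answer.  */
data _null_;
  set have1;
  by id mode time;
  f_mode = first.mode;   /* 1 on the first row of each ID/Mode group */
  l_mode = last.mode;    /* 1 on the last row of each ID/Mode group  */
  put id= mode= time= f_mode= l_mode=;
run;
```

Because have1 is sorted by ascending time within each ID/Mode pair, the row with l_mode=1 is the one with the latest time for that pair.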

Alternatively, a simple proc sql does the same:

proc sql;
  /* SAS remerges max(time) back onto each row of the ID/Mode group. */
  create table want1 as
  select id, time, mode, event
    from have
    group by id, mode
    having time = max(time);
quit;
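
One difference between the two approaches is worth checking: the HAVING clause keeps every row whose time equals the group maximum, so an ID/Mode pair with two events on the same latest date would appear twice in want1, whereas the last.mode test keeps exactly one row. A minimal check, assuming want1 was created as above (n_rows is a hypothetical column name):

```sas
/* List ID/Mode pairs that occur more than once in want1, i.e. ties on the latest time. */
proc sql;
  select id, mode, count(*) as n_rows
    from want1
    group by id, mode
    having count(*) > 1;
quit;
```

If this query returns rows, you need to decide how to break the ties, for example with an additional sort key.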

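To make the code in your question work, the first sort needs to order by ID and Mode, with time descending within each pair:

  proc sort data=df; by ID Mode descending time; run;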
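
Put together, the two-step sort could look like the sketch below. It assumes the input dataset is have from the answer above; the output names sorted, latest and older are chosen here for illustration. It also relies on PROC SORT keeping the input order of rows with equal BY values (the EQUALS option, which is the default), so that NODUPKEY retains the first, i.e. latest, row of each pair.

```sas
/* Step 1: within each ID/Mode pair, put the latest time first. */
proc sort data=have out=sorted;
  by id mode descending time;
run;

/* Step 2: keep the first row per ID/Mode pair.             */
/* Rows that are dropped as duplicates are written to older. */
proc sort data=sorted out=latest dupout=older nodupkey;
  by id mode;
run;
```

Note that DUPOUT= receives the discarded rows, not the deduplicated result, and that without OUT= the input dataset is overwritten in place.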