SAS: Crosswalk based on values that change over time

Time: 2017-05-19 11:41:27

Tags: sql sas

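I have a problem. IDs change over time, and I need to create a crosswalk to use with another table. No matter how an ID changes, I need to map it back to the ID it had in the first year.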

data have;
input ORIG_ID $ CHANGE_ID $ YEAR;
datalines;
AAA  BBB  1990
BBB  AAA  1991
PPP  ZZZ  1993
ZZZ  YYY  1994
YYY  ZZZ  1996
TTT  MMM  1990
;

**What I want :**

 /****OUTPUT****/

CHANGE_ID ORIG_ID
BBB       AAA  
ZZZ       PPP
YYY       PPP
MMM       TTT

/* My logic so far */
proc sql;
    create table temp as
    select CHANGE_ID, ORIG_ID,
    /* Build an order-independent key so that AAA->BBB and BBB->AAA group together */
    case 
        when (CHANGE_ID<ORIG_ID) then cat(CHANGE_ID,ORIG_ID)
        when (ORIG_ID<CHANGE_ID) then cat(ORIG_ID,CHANGE_ID)
        end as key, year
    from have
    order by key,year;
quit;

data final;
    retain CHANGE_ID ORIG_ID;
    set temp;
    by key;
    /* Keep the earliest row for each key */
    if first.key;
run;
/* This works for an ID changing from AAA to BBB, but maybe not for other cases */

Let me know if anything is unclear.

1 answer:

Answer 0: (score: 1)

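Here is a hash object approach. It should work provided that:

  • You have enough memory to hold the whole input table
  • There is only 1 change per ID per year, or
  • If there are multiple changes per ID per year, their order can be determined in some way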
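
As a side check on the second condition, rows can be counted per ID and year before running the approach. This is only a sketch: `dup_check` and `N_CHANGES` are made-up names, and it assumes that "per ID" means per `CHANGE_ID`, since that is the variable the hash lookup is keyed on. It expects the `have` table to exist already.

```sas
/* Sketch: list every CHANGE_ID/YEAR pair that occurs more than once in HAVE. */
/* DUP_CHECK and N_CHANGES are illustrative names, not part of the answer.    */
proc sql;
    create table dup_check as
    select CHANGE_ID, YEAR, count(*) as N_CHANGES
    from have
    group by CHANGE_ID, YEAR
    having count(*) > 1;
quit;
```

If `dup_check` comes back empty, there is at most one change per `CHANGE_ID` per year. Any rows it does contain are the cases where the order of the changes would have to be settled some other way.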


data have;
input ORIG_ID $ CHANGE_ID $ YEAR;
cards;
AAA  BBB  1990
BBB  AAA  1991
PPP  ZZZ  1993
ZZZ  YYY  1994
YYY  ZZZ  1996
TTT  MMM  1990
;
run;

proc sort data = have;
by CHANGE_ID descending YEAR;
run;

data v_have /view = v_have;
    set have;
    by CHANGE_ID;
    if first.CHANGE_ID then GRP = 0;
    GRP + 1;    
run;


data want;
  informat CHANGE_ID ORIG_ID;
  set v_have;
  /*Although we need the whole table in the hash, we only need to process each CHANGE_ID once*/
  by CHANGE_ID;
  if first.CHANGE_ID;

  /*Create a hash object to hold the table*/
  if _n_ = 1 then do;
    dcl hash h(dataset:'v_have');
    rc = h.definekey('CHANGE_ID','GRP');
    rc = h.definedata('ORIG_ID','YEAR');
    rc = h.definedone();
  end;

  /*Make a temp copy of the starting ID*/
  T_CHANGE_ID = CHANGE_ID;
  /*Upper bound for the first comparison; assumes every YEAR in the data is before 2000*/
  PREV_YEAR = 2000;
  rc=0;

  /*Follow chains of IDs backwards through the table, making sure we only step backwards in time to avoid looping*/
  do while(rc = 0 and YEAR < PREV_YEAR);
    PREV_YEAR = YEAR;
    CHANGE_ID = ORIG_ID;
    /*If there are multiple records per CHANGE_ID, take the most recent one that is earlier than the current record*/
    do GRP = 1 by 1 while(rc=0 and YEAR >= PREV_YEAR);
        rc = h.find();
    end;
  end;

  /*Output the last CHANGE_ID we got to plus the starting ID */
  ORIG_ID   = CHANGE_ID;  
  CHANGE_ID = T_CHANGE_ID;

  /*Ignore trivial rows resulting from cycles*/
  if CHANGE_ID ne ORIG_ID;
  keep CHANGE_ID ORIG_ID;
run;

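It is slightly more complex now, but it works. I could probably have used a multidata hash object for this, as I think that might make it possible to eliminate the initial sort.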
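
For illustration, a multidata variant might look roughly like the following. This is an untested sketch under stated assumptions, not the answer's code: the names `want_md`, `H_ORIG`, `H_YEAR`, `seen`, `START_ID`, `BOUND`, `BEST_YEAR` and `BEST_ORIG` are all invented here. It loads `have` unsorted into a hash that allows several records per `CHANGE_ID`, then for each starting ID repeatedly picks the latest record that is strictly earlier than the previous step.

```sas
/* Sketch only: multidata hash variant, no PROC SORT and no view needed */
data want_md;
    /* Put CHANGE_ID, H_ORIG and H_YEAR into the PDV without reading any rows */
    if 0 then set have(rename=(ORIG_ID=H_ORIG YEAR=H_YEAR));
    if _n_ = 1 then do;
        /* All change records, several per CHANGE_ID allowed, in any order */
        dcl hash h(dataset:'have(rename=(ORIG_ID=H_ORIG YEAR=H_YEAR))', multidata:'y');
        h.definekey('CHANGE_ID');
        h.definedata('H_ORIG','H_YEAR');
        h.definedone();
        /* Starting IDs that have already been processed */
        dcl hash seen();
        seen.definekey('CHANGE_ID');
        seen.definedone();
    end;

    set have(keep=CHANGE_ID);
    /* Process each starting CHANGE_ID only once */
    if seen.check() = 0 then delete;
    rc = seen.add();

    START_ID = CHANGE_ID;
    BOUND = .;   /* missing means no upper bound on YEAR yet */

    /* Step backwards in time until no earlier record exists for the current ID */
    do until (BEST_YEAR = .);
        BEST_YEAR = .;
        rc = h.find();
        do while (rc = 0);
            /* Keep the latest record that is strictly earlier than BOUND */
            if (BOUND = . or H_YEAR < BOUND) and H_YEAR > BEST_YEAR then do;
                BEST_YEAR = H_YEAR;
                BEST_ORIG = H_ORIG;
            end;
            rc = h.find_next();
        end;
        if BEST_YEAR ne . then do;
            CHANGE_ID = BEST_ORIG;
            BOUND = BEST_YEAR;
        end;
    end;

    /* Output the starting ID plus the earliest ID reached */
    ORIG_ID   = CHANGE_ID;
    CHANGE_ID = START_ID;

    /* Ignore trivial rows resulting from cycles */
    if CHANGE_ID ne ORIG_ID;
    keep CHANGE_ID ORIG_ID;
run;
```

Because `BOUND` gets strictly smaller on every step, the walk cannot loop forever even when IDs cycle. The cost is that every step scans all records for the current ID rather than relying on sort order.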