SAS hash join vs. DATA step merge

Time: 2016-04-12 13:59:35

Tags: hash merge sas

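I have to merge three tables, each with between 100K and 200K records. The merge takes about five minutes. I need help converting the code below to a hash join. Many thanks in advance.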

data &dsource..main_input;
   merge &dsource..sorted_swf (in=here1)
         &dsource..sorted_input2 (in=here2)
         &dsource..sorted_input9 (in=here3 );
   by control;
   if (here1) then do;
      %recode_div    

      if typec = 45 then elig_hu= '1';    
      else if (status eq '1') or ((status in ('2','3')) and (type in ('1','2','4','6','10','11'))) then elig_hu = '1';
      else if (status eq '4') then do;
         if (noint in (1,2,3,5,6)) then elig_hu = '1';
         else if (noint eq 4) or (10 <= noint <= 43) then elig_hu = '0';
      end;
      else if (status in ('2','3')) and (type in ('5','7','8','9')) then elig_hu = '0';
      else elig_hu = '9';
      output;

      keep var1 var2 var3 var4;
   end;
run;

data want;
   set finput.input2;
   if _n_ = 1 then do;
      %create_hash(in2,control,region,"ftest5.swf");
      %create_hash(in9,control,hudadmin,"ftest5.input9");
   end;
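   /* <initialize lookup variables> */
   rc2 = in2.find();   /* 0 when control is found in ftest5.swf */
   rc9 = in9.find();   /* 0 when control is found in ftest5.input9 */
   /* separate return codes, so the second find() does not overwrite the first */
   if rc2 or rc9 then do;
      /* <handle case where lookup fails> */
   end;
   drop rc2 rc9;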
run;

1 answer:

Answer 0 (score: 0)

Grab the %create_hash() macro from here.

General usage is:

data want;
set have;
format <new variables to look up>;

if _n_ = 1 then do;
   %create_hash(obj,keyvar1 keyvar2 ..., lookupvar1 lookupvar2 ..., "lookup data set");
end;

<initialize lookup variables>
rc = obj.find();

if rc then do;
  <handle case where lookup fails>
end;

drop rc;
run;
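
Since the source of %create_hash() is not shown in this thread, here is a rough sketch of the same lookup pattern written directly with the DATA step hash object, applied to the three tables from the question. It assumes that sorted_swf is the driving table (matching the `if (here1)` test in the merge), that control is unique within each lookup table, and that the lookup columns do not already exist in sorted_swf. The column name input2_var is hypothetical, and hudadmin is borrowed from the question's own attempt; substitute whichever columns are actually needed.

```sas
/* Sketch only, under the assumptions stated above. */
data &dsource..main_input;
   /* Compile-time only: puts the lookup columns in the PDV, reads no rows. */
   if 0 then set &dsource..sorted_input2 (keep=control input2_var)  /* input2_var is hypothetical */
                 &dsource..sorted_input9 (keep=control hudadmin);

   if _n_ = 1 then do;
      /* Load each lookup table into memory once. No sorting is required. */
      declare hash in2 (dataset: "&dsource..sorted_input2 (keep=control input2_var)");
      in2.defineKey('control');
      in2.defineData('input2_var');
      in2.defineDone();

      declare hash in9 (dataset: "&dsource..sorted_input9 (keep=control hudadmin)");
      in9.defineKey('control');
      in9.defineData('hudadmin');
      in9.defineDone();
   end;

   /* Driving table: every row is kept, as with "if (here1)" in the merge. */
   set &dsource..sorted_swf;

   /* Lookup columns are retained across rows, so clear them before each find(). */
   call missing(input2_var, hudadmin);
   rc2 = in2.find();   /* 0 = control found in sorted_input2 */
   rc9 = in9.find();   /* 0 = control found in sorted_input9 */

   /* <%recode_div and the elig_hu logic from the original step go here> */

   drop rc2 rc9;
run;
```

In this form, `rc2 = 0` and `rc9 = 0` play the roles of here2 and here3 in the merge. One difference to keep in mind: a hash loaded this way keeps a single row per key, so if control repeats in a lookup table the result will not match what MERGE produces.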