SAS - Grouping by ID, how do I convert all values to the minimum value of the rows?

Time: 2017-05-06 21:02:58

Tags: sas

I would like a data step or SQL statement that does the following.

Consider the following table:

(Before)

id  div dlenfol repurch rlenfol
1   0   145      1         25
2   0   114      0         114
2   0   114      0         114
3   0   189      1         53
3   0   189      0         189
3   1   149      0         189
4   1   14       0         182
4   0   182      1         46
4   0   182      0         182

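Grouping by ID, how do I set every value of dlenfol to the minimum of the dlenfol column within that ID, and every value of rlenfol to the minimum of the rlenfol column within that ID?

I would also like to create a variable called choice that is: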

=1 if a certain id EVER had a div=1; 
=0 if a certain id EVER had a repurch=1 (but never had a div=1); 
=1 if a certain id EVER had a div=1 AND EVER had a repurch=1; 
and =. if the certain id never had a div=1 nor repurch=1.  

That is, like this:

(After)

id  div dlenfol repurch rlenfol choice
1   0   145        1    25         0
2   0   114        0    114        .
2   0   114        0    114        .
3   0   149        1    53         1
3   0   149        0    53         1
3   1   149        0    53         1
4   1   14         0    46         1
4   0   14         1    46         1
4   0   14         0    46         1

The code I have been trying does not work:

data comb2d;
set comb;
do;
    set comb;
    by id;
    dmin = min(dlenfol, dmin);
    rmin = min(rlenfol, rmin);
    if dlenfol=dmin and rlenfol^=rmin then CHOICE=1;
    else if dlenfol^=dmin and rlenfol=rmin then CHOICE=0;
    else if dlenfol=dmin and rlenfol=rmin then CHOICE=1;
    else CHOICE=.;
    /* if DIV=1 and REPURCH=0 then CHOICE=1;
    else if DIV=0 and REPURCH=1 then CHOICE=0;
    else if DIV=1 and REPURCH=1 then CHOICE=1;
    else CHOICE=.; */
  end;
  dlenfol = dmin;
  rlenfol = rmin;
  /* drop dmin;
  drop rmin; */
run;

The following SQL code seems to solve the minimum-value problem, but it creates two variables that I do not need (dmin and rmin):

proc sql;
create table comb3 as
select *, min(dlenfol) as dmin, min(rlenfol) as rmin
from comb
group by comb.id;
quit;

4 Answers:

Answer 0: (score: 0)

proc sort data=before out=sort1;
  by id dlenfol;
run;

data sort1;
  set sort1;
  by id;
  retain temp;
  drop temp;
  if first.id then temp = dlenfol;
  else dlenfol = temp;
run;

Then do the same thing for rlenfol, sorting by id and rlenfol first.
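
The rlenfol pass could look something like the sketch below. It assumes sort1 already holds the dlenfol result from the step above; sort2 is just a hypothetical output name. Note that SAS sorts missing numeric values first, so if rlenfol can be missing, that missing value would be picked up as the "minimum" with this approach.

```sas
/* Re-sort so the smallest rlenfol comes first within each id */
proc sort data=sort1 out=sort2;
  by id rlenfol;
run;

data sort2;
  set sort2;
  by id;
  retain temp;   /* carries the first (smallest) rlenfol down the group */
  drop temp;
  if first.id then temp = rlenfol;
  else rlenfol = temp;
run;
```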

Answer 1: (score: 0)

As follows:

proc sql;
create table comb3 as
select id, div, repurch, min(dlenfol) as dlenfol, min(rlenfol) as rlenfol
from comb
group by comb.id;
quit;
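
The same query can probably be extended to build choice as well. This is only a sketch, assuming div and repurch are 0/1 flags as in the question's sample data. It relies on PROC SQL remerging the group summaries back onto the detail rows (SAS writes a note about this to the log). Row order after the remerge is not guaranteed, so add an ORDER BY if it matters.

```sas
proc sql;
  create table comb3 as
  select id, div, repurch,
         min(dlenfol) as dlenfol,
         min(rlenfol) as rlenfol,
         /* max() of a 0/1 flag is 1 if the id EVER had the flag set */
         case
           when max(div) = 1 then 1
           when max(repurch) = 1 then 0
           else .
         end as choice
  from comb
  group by id;
quit;
```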

Answer 2: (score: 0)

The following code now seems to be working:

proc sql;
create table comb3 as
select *, min(dlenfol) as dmin, min(rlenfol) as rmin, max(choice) as choicemax
from comb
group by comb.gvkey
order by comb.gvkey, comb.fyear;
quit;

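Answer 3: (score: 0)

This is a good example of a problem that can be handled with a double DOW loop. In the first loop, find the minimum values and check whether the flag variables are ever true. You then have the information you need to define the new CHOICE variable, and in the second loop you can set the variables to the minimum values.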

data want ;
  do until(last.id);
    set have ;
    by id ;
    mind=min(mind,dlenfol);
    minr=min(minr,rlenfol);
    anydiv=anydiv or div;
    anyrep=anyrep or repurch;
  end;
  if anydiv then choice=1;
  else if anyrep then choice=0;
  else choice=.;
  do until(last.id);
    set have;
    by id;
    dlenfol=mind;
    rlenfol=minr;
    output;
  end;
  drop mind minr anydiv anyrep;
run;
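
To try this out, a test dataset can be built from the (Before) table in the question. The sketch below only creates the input and prints the result; run the first step before the data want step above, and the PROC PRINT after it. The name have simply matches the dataset used in that step.

```sas
/* Sample input copied from the (Before) table in the question */
data have;
  input id div dlenfol repurch rlenfol;
  datalines;
1 0 145 1 25
2 0 114 0 114
2 0 114 0 114
3 0 189 1 53
3 0 189 0 189
3 1 149 0 189
4 1 14 0 182
4 0 182 1 46
4 0 182 0 182
;
run;

/* After running the data want step, inspect the result */
proc print data=want noobs;
run;
```

With this input the output should match the (After) table: for example, id 3 gets dlenfol=149, rlenfol=53 and choice=1 on all three rows, and id 2 gets choice=. on both rows.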