Creating new variables in SAS

Time: 2018-12-11 23:59:43

Tags: if-statement sas where transpose retain

I want to use variables A through D to create two new variables, "Weight" and "Height":

DATA: 
A       B     C    D       
Jim  Weight  180   Screen
Jim  Weight  200   C1
Jim  Height  60    Screen
Jim  Height  61    C3
Tod  Weight  190   Screen
Tod  Weight  201   C1
Tod  Height  70    Screen
Tod  Height        C1

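The Weight variable follows these rules: if column B = Weight, column D = C1, and column C is not missing, then set Weight to column C. Otherwise, if column D is not C1 or column C is missing, use the value of column C from the row where column D is Screen. In short, if Jim was weighed at screening but not at C1, I want to keep his screening weight. Likewise, if he has a C1 record but its weight is missing, I want to keep his screening weight. The same applies to the Height variable.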

My incorrect code is:

DATA MYTEST; 
    SET TEST.TEST; 
    if B = 'WEIGHT' and D = 'C1D1' and not missing(C) then NEW = C;
    else if (missing(C) or D ~= 'C1') and B = 'WEIGHT' then WEIGHT = C where D = 'Screen';
    if B = 'HEIGHT' and D = 'C1D1' and not missing(C) then NEW = C;
    else if (missing(C) or D ~= 'C1') and B = 'HEIGHT' then WEIGHT = C where D = 'Screen';
    else WEIGHT = 'NA';
 RUN; 
 PROC PRINT DATA = MYTEST; 
 RUN; 

Desired result:

DATA: 
A    Weight   Height 
Jim   200       60
Tod   201       70

1 answer:

Answer 0 (score: 1)

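The result data can be created with an UPDATE statement applied to the transposed data. UPDATE differs from MERGE in that missing values in the transaction (update) data set never overwrite existing values in the program data vector (PDV).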

DATA have;
input 
a $  b $     c     d $; datalines;
Jim  Weight  180   Screen
Jim  Weight  200   C1
Jim  Height  60    Screen
Jim  Height  61    C3
Tod  Weight  190   Screen
Tod  Weight  201   C1
Tod  Height  70    Screen
Tod  Height  .     C1
run;

proc transpose data=have out=haveT;
  by a d notsorted;
  var c;
  id b;
run;

data haveScreen / view=haveScreen;
  set haveT;
  where d='Screen';
  by a;
  if first.a;
run;

data want;
  update
    haveScreen
    haveT (where=(d in ('Screen', 'C1')))
  ;
  by a;
run;
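The difference between UPDATE and MERGE described above can be seen in a minimal sketch. The data set names master, trans, via_update, and via_merge and the variables id and x are hypothetical, chosen only for illustration; the sketch assumes the default UPDATE behavior (UPDATEMODE=MISSINGCHECK).

```sas
/* Hypothetical master data set: one row per BY value */
data master;
  input id x;
  datalines;
1 10
;
run;

/* Hypothetical transaction data set with a missing x */
data trans;
  input id x;
  datalines;
1 .
;
run;

/* UPDATE: the missing x in trans does not replace 10 */
data via_update;
  update master trans;
  by id;
run;

/* MERGE: the value from the last data set wins, even when missing */
data via_merge;
  merge master trans;
  by id;
run;

proc print data=via_update; run;  /* x is 10 */
proc print data=via_merge;  run;  /* x is missing */
```

This is why the missing C1 height for Tod leaves his screening height in place in the answer's final step.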

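The code you posted does not use WHERE correctly. A WHERE statement is not applied conditionally, and it cannot change dynamically while the data step is running. It is applied once, at step initialization. WHERE is a non-executable (unconditional) statement, and the last one that appears in the data step code is the one applied when the step runs.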

For example, in the following step `if 0` is never true, yet its WHERE is applied anyway:

options msglevel=i;
data _null_;
  set sashelp.class;
  if name =: 'X' then where age > 12;
  if 0 then where age > 1;
run;
----- LOG -----
4625  options msglevel=i;
4626  data _null_;
4627    set sashelp.class;
4628    if name =: 'X' then where age > 12;
4629    if 0 then where age > 1;
NOTE: WHERE clause has been replaced.
4630    run;

NOTE: There were 19 observations read from the data set SASHELP.CLASS.
      WHERE age>1;
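Because WHERE cannot express row-by-row conditions, the same rules can instead be written with ordinary IF/THEN logic plus RETAIN. The following is only a sketch, not the answer's method: it assumes the data set have created above, with rows grouped by a, and the helper variables w_screen, w_c1, h_screen, and h_c1 and the output name want_retain are hypothetical. Note that character comparisons in SAS are case-sensitive, so the literals must match the stored values ('Weight', not 'WEIGHT').

```sas
data want_retain (keep=a Weight Height);
  set have;
  by a notsorted;

  /* Hold the Screen and C1 values across the rows of one subject */
  retain w_screen w_c1 h_screen h_c1;
  if first.a then call missing(w_screen, w_c1, h_screen, h_c1);

  if b = 'Weight' then do;
    if d = 'Screen' then w_screen = c;
    else if d = 'C1' then w_c1 = c;
  end;
  else if b = 'Height' then do;
    if d = 'Screen' then h_screen = c;
    else if d = 'C1' then h_c1 = c;
  end;

  /* Prefer the C1 value; fall back to Screen when C1 is absent or missing */
  if last.a then do;
    Weight = coalesce(w_c1, w_screen);
    Height = coalesce(h_c1, h_screen);
    output;
  end;
run;
```

With the sample data this should give Jim 200 and 60, and Tod 201 and 70. Unlike the UPDATE approach, this version does not depend on the Screen row preceding the C1 row within each subject.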