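Ranking across a dynamic number of columns in SAS

Asked: 2018-10-23 11:20:49

Tags: sas

I have a dataset containing a key and scores for different factors (A, B, C, D, ...). It looks like this: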

data scores;
input KEY A B C D E F G H;
cards; 
1 1 2 4 4 4 9 9 7   
2 1 2 3 4 5 6 7 8   
3 7 8 9 9 6 5 5 4   
4 4 9 9 7 7 8 5 1 
run;

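I am trying to rank the factor scores within each row. With the number of factors hard-coded, the output I want comes from this: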

proc sql;
create table scorerank as
select   *
        ,(ordinal(1,A,B,C,D,E,F,G,H)) as ScoreRank1
        ,(ordinal(2,A,B,C,D,E,F,G,H)) as ScoreRank2
        ,(ordinal(3,A,B,C,D,E,F,G,H)) as ScoreRank3
        ,(ordinal(4,A,B,C,D,E,F,G,H)) as ScoreRank4
        ,(ordinal(5,A,B,C,D,E,F,G,H)) as ScoreRank5
        ,(ordinal(6,A,B,C,D,E,F,G,H)) as ScoreRank6
        ,(ordinal(7,A,B,C,D,E,F,G,H)) as ScoreRank7
        ,(ordinal(8,A,B,C,D,E,F,G,H)) as ScoreRank8
from scores;
quit;

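My problem is that the number of factors changes each time. That means the argument list passed to the ORDINAL function is dynamic, and the ScoreRankX variables have to go up to the number of scores.

I have tried this as a start: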

%let num = 8;
%let factors = A,B,C,D,E,F,G,H;

data datarank;
set scores;
do i = 1 to &num.;
ScoreRank&num. = (ordinal(&num.,&factors.));
end;
run;

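I can change the %let statements at the top of each program, but I am trying to make the ranking part more automated. Any ideas on how to improve the code above? At the moment its output is incorrect: it contains only the last rank variable and "i" (even though I have a DO loop?).

Any help is much appreciated.

2 answers:

Answer 0: (score: 4)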

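For ranking within rows, I think this is what you want. If you want to name the RANK variables differently, you can use EXPAND_VARLIST as in the other answer. Then you would not need &num.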

%let num = 8;
data datarank;
   set scores;
   array score[*] a--h;
   array Rank[&num];
   do i = 1 to dim(rank);
      Rank[i] = ordinal(i,of score[*]);
      end;
   drop i;
   run;
proc print;
   run;

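The attempt in the question fails because macro variables are resolved once, before the DATA step compiles: `ScoreRank&num.` becomes `ScoreRank8` and `ordinal(&num., ...)` becomes `ordinal(8, ...)`, so every pass of the loop assigns the same variable. The following is a minimal sketch of one way to avoid hard-coding both %let values. It assumes the data set is WORK.SCORES, that every numeric column other than KEY is a factor, and that no factor is named `i`; the macro variable names `factors` and `num` are carried over from the question.

```sas
/* Sketch: build the factor list and its length from the data set itself. */
/* Assumption: every numeric column in WORK.SCORES except KEY is a factor. */
proc sql noprint;
   select name
      into :factors separated by ' '
   from dictionary.columns
   where libname = 'WORK'
     and memname = 'SCORES'
     and type    = 'num'
     and upcase(name) ne 'KEY'
   order by varnum;
quit;

/* SQLOBS holds the number of rows the last SELECT returned, */
/* which here is the number of factor columns.               */
%let num = &sqlobs;

data datarank;
   set scores;
   array score[*] &factors;       /* the factor columns, in data set order */
   array ScoreRank[&num];         /* creates ScoreRank1-ScoreRank&num      */
   do i = 1 to dim(ScoreRank);
      /* i-th smallest value among the factors of the current row */
      ScoreRank[i] = ordinal(i, of score[*]);
   end;
   drop i;
run;
```

The array index `i` changes at run time, which is what the macro variable in the original loop could not do. If the factor columns are always contiguous, the positional range `a--h` used in the answer is simpler; the dictionary query only matters when the set of columns changes between runs.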

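Answer 1: (score: 2)

Do you want to rank the columns or the rows? The ORDINAL function ranks within rows. It seems like you should be using PROC RANK and ranking the columns. By default PROC RANK writes the ranks back into the same variables, which works well most of the time, but you seem to want new names, so I have added a way to do that.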

data scores;
   input KEY A B C D E F G H;
   cards; 
1 1 2 4 4 4 9 9 7   
2 1 2 3 4 5 6 7 8   
3 7 8 9 9 6 5 5 4   
4 4 9 9 7 7 8 5 1 
   run;
proc print;
   run;
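/* The %expand_varlist macro (listed below) must be compiled before this line runs. */
/* It reads the variable names from SCORES; the RANKS data set does not exist yet.  */
%let ranks=%expand_varlist(data=scores,var=a-numeric-h,expr=cats('Rank_',_name_));
proc rank data=scores out=ranks ties=mean;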
   var a--h;
   ranks &ranks;
   run;
proc print;
   run;

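If pulling in a utility macro is not an option, the same list of new rank names can be built with a dictionary query. This is only a sketch under the same assumptions as the data above: the input is WORK.SCORES, every numeric column except KEY should be ranked, and each name is short enough that adding the `Rank_` prefix stays within the 32-character limit. The macro variable names `vars` and `ranks` are illustrative.

```sas
/* Sketch: derive the VAR list and a matching Rank_ list without a macro. */
/* Assumption: every numeric column in WORK.SCORES except KEY is ranked.  */
proc sql noprint;
   select name,
          cats('Rank_', name)
      into :vars  separated by ' ',
           :ranks separated by ' '
   from dictionary.columns
   where libname = 'WORK'
     and memname = 'SCORES'
     and type    = 'num'
     and upcase(name) ne 'KEY'
   order by varnum;
quit;

/* Rank each column across rows; tied values share the mean rank. */
proc rank data=scores out=ranks ties=mean;
   var &vars;
   ranks &ranks;
run;

proc print data=ranks;
run;
```

Because both lists come from the same query in the same order, each name in the RANKS statement lines up with the corresponding variable in the VAR statement.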

The EXPAND_VARLIST macro

%macro
   expand_varlist /*Returns an expanded variable list and optionally creates an indexed data set of variable names*/
      (
         data  = _LAST_,            /*[R]Input data*/
         var   = _ALL_,             /*[R]Variable List expanded*/
         where = 1,                 /*[R]Where clause to subset OUT=, useful for selecting by a name suffix e.g. where=_name_ like '%_Status'*/
         expr  = nliteral(&name),   /*[R]An expression that can be used to modify the names in the expanded list*/
         keep  = ,                  /*[O]Keep data set option for DATA=*/
         drop  = ,                  /*[O]Drop data set option for DATA=*/
         out   = ,                  /*[O]Output data indexed by _NAME_ and _INDEX_*/
         name  = _NAME_,            /*[R]Name of the variable name variable in the output data set*/
         label = _LABEL_,           /*[R]Name of the variable name label variable in the output data set*/
         index = _INDEX_,           /*[R]Name of the variable index variable in the output data set*/
         dlm   = ' '                /*[R]List delimiter*/
      );
   %local m i;
   %let i=&sysindex;
   %let m=&sysmacroname._&i;
   %do %while(%symexist(&m));
      %let i = %eval(&i + 1);
      %let m=&sysmacroname._&i;
      %end;
   %put NOTE: &=m is a unique symbol name;
   %local rc &m code1 code2 code3 code4;
   %if %superq(out) ne %then %let code3 = %str(data &out(index=(&index &name)); set &out; &index+1; run;);
   %else %do;
      %let out=%str(work._deleteme_);
      %let code3 = %str(proc delete data=work._deleteme_; run;);
      %end;
   %let code1 = %str(options notes=0; proc transpose name=&name label=&label data=&data(obs=0 keep=&keep drop=&drop) out=&out(where=(&where)); var &var; run;);
   %let code2 = %str(proc sql noprint; select &expr into :&m separated by &dlm from &out; quit;);
   %let code4 = %str(options notes=1;);
   %let rc=%sysfunc(dosubl(&code1 &code2 &code3 &code4));
&&&m.
   %mend expand_varlist;