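Categorize variables based on conditions stored in another table (SAS)

Time: 2018-02-02 12:59:10

Tags: sas

I want to categorize the variables from one table, such as this one: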

Var1 Var2
19   0.2
30   0.1
45   0.2

using a table that stores the categorization conditions:

variable condition   category
Var1     Var1<20         1
Var1     40>Var1>=20     2
Var1     Var1>=40        3
Var2     Var2<0.2        1
Var2     Var2>=0.2       2

The result would be a new table that holds the category of each variable from the first table:

Var1 Var2
1     2
2     1
3     2

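2 answers:

Answer 0: (score: 1)

Here is a macro way of doing this. It assumes that the conditions in the table are grouped by variable and listed in the order in which you want them applied. If they are not, sort the table appropriately first.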
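If the conditions table does need sorting, a minimal sketch could look like the following. It assumes the table is named conditions, as in the test data further down, and, as a further assumption not stated in the answer, that ascending category order within each variable is the order in which the conditions should be tested.

```sas
/* Assumption: within each variable, conditions are meant to be */
/* tested in ascending order of category.                       */
proc sort data=conditions;
  by variable category;
run;
```

If the intended order differs from the category order, an explicit sequence column in the conditions table (hypothetical, not part of the original data) would be a safer sort key.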

First, the test data:

data have;
input Var1 Var2;
datalines;
19   0.2
30   0.1
45   0.2
;

data conditions;
informat variable condition $32.;
input variable $ condition $  category;

datalines;
Var1     Var1<20         1
Var1     40>Var1>=20     2
Var1     Var1>=40        3
Var2     Var2<0.2        1
Var2     Var2>=0.2       2
;

Now make a macro. We will read the table into macro variables and then write a data step that applies them, using an IF / THEN / ELSE block for each variable.

%macro apply_conditions();
%local i j n;

proc sql noprint;
/* count the condition rows */
select count(*) into :n trimmed from conditions;

%do i=1 %to &n;
   %local var&i;
   %local condition&i;
   %local category&i;
%end;

/* load each row into its own set of macro variables */
select variable, condition, category
   into :var1 - :var&n,
        :condition1 - :condition&n,
        :category1 - :category&n
   from conditions;
quit;

data want;
set have;
%do i=1 %to &n;
   /*If the variable changes, then don't add the ELSE */
   %if &i>1 %then %do;
      %let j=%eval(&i-1);
      %if &&var&i = &&var&j %then %do;
         else
      %end;
   %end;
   /*apply the condition*/
   if &&condition&i then &&var&i = &&category&i;
%end;
run;
%mend;

Finally, run the macro. Use MPRINT to see the generated code.
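The invocation itself is not shown in this answer. A minimal sketch, assuming the apply_conditions macro and the have and conditions data sets defined above, would turn on the MPRINT system option and then call the macro. The PROC PRINT step is an addition for checking the result and is not part of the original answer.

```sas
options mprint;        /* write the macro-generated statements to the log */
%apply_conditions();   /* builds WORK.WANT from HAVE using CONDITIONS     */

proc print data=want;  /* added for illustration: inspect the recoded values */
run;
```

With MPRINT on, the log should show the generated data step as one IF / ELSE IF chain per variable, which makes it easy to confirm that the conditions were applied in the intended order.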

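Answer 1: (score: 1)

This is just a duplicate of an earlier question: Categorize variables basing on conditions from other data set

Data-driven code is easier to create and debug if you generate it with plain SAS code rather than adding the complexity of macro code.

Here is a more detailed answer. First, let's turn your example data printouts into actual SAS data sets.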

data rawdata ;
  input Var1 Var2;
cards;
19   0.2
30   0.1
45   0.2
;

data metadata ;
  input variable :$32. condition :$200. category ;
cards;
Var1     Var1<20         1
Var1     40>Var1>=20     2
Var1     Var1>=40        3
Var2     Var2<0.2        1
Var2     Var2>=0.2       2
;

Now let's use the metadata to generate an SQL SELECT statement with one CASE expression per output variable.

filename code temp;
data _null_;
  set metadata end=eof;
  by variable ;
  file code ;
  retain sep ' ';
  if _n_=1 then put "create table want as select";
  if first.variable then put sep $1. 'case ';
  put '  when (' condition ') then ' category ;
  if last.variable then put '  else . end as ' variable ;
  if eof then put 'from rawdata' / ';' ;
  sep=',' ;
run;
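Because the generated program is an ordinary text file, it can be inspected before it is run. A minimal sketch, assuming the fileref code from the step above is still assigned in the current session:

```sas
/* Echo the generated SQL to the log without executing it */
data _null_;
  infile code;    /* read the temporary file written by the previous step */
  input;          /* load the next line into the _INFILE_ buffer */
  put _infile_;   /* write that line to the log as-is */
run;
```

This lists the generated statement line by line, so a malformed condition in the metadata can be spotted before PROC SQL sees it.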

Then run it.

proc sql;
%include code / source2 ;
quit;

Example SAS log:

1639  proc sql;
1640  %include code / source2 ;
NOTE: %INCLUDE (level 1) file CODE is file C:\Users\xxx\AppData\Local\Temp\1\SAS Temporary Files\_TD13724_AMRL20B7F00CGPP_\#LN00654.
1641 +create table want as select
1642 + case
1643 +  when (Var1<20 ) then 1
1644 +  when (40>Var1>=20 ) then 2
1645 +  when (Var1>=40 ) then 3
1646 + else . end as Var1
1647 +,case
1648 +  when (Var2<0.2 ) then 1
1649 +  when (Var2>=0.2 ) then 2
1650 + else . end as Var2
1651 +from rawdata
1652 +;
NOTE: Table WORK.WANT created, with 3 rows and 2 columns.

Result:

Obs    Var1    Var2

 1       1       2
 2       2       1
 3       3       2

If you want to turn this into a macro, just replace the hard-coded input and output data set names with macro variable references.

%macro gencat(indata=,outdata=,metadata=metadata);

filename code temp;
data _null_;
  set &metadata end=eof;
  by variable ;
  file code ;
  retain sep ' ';
  if _n_=1 then put "create table &outdata as select";
  if first.variable then put sep $1. 'case ';
  put '  when (' condition ') then ' category ;
  if last.variable then put ' else . end as ' variable ;
  if eof then put "from &indata" / ';' ;
  sep=',' ;
run;

proc sql;
%include code / nosource2 ;
quit;

%mend gencat;

Calling the macro with these values now gives the same result:

%gencat(indata=rawdata,outdata=want)

The log now looks like this:

1783  %gencat(indata=rawdata,outdata=want)
MPRINT(GENCAT):   filename code temp;
NOTE: PROCEDURE SQL used (Total process time):
      real time           10.35 seconds
      cpu time            0.20 seconds


MPRINT(GENCAT):   data _null_;
MPRINT(GENCAT):   set metadata end=eof;
MPRINT(GENCAT):   by variable ;
MPRINT(GENCAT):   file code ;
MPRINT(GENCAT):   retain sep ' ';
MPRINT(GENCAT):   if _n_=1 then put "create table want as select";
MPRINT(GENCAT):   if first.variable then put sep $1. 'case ';
MPRINT(GENCAT):   put '  when (' condition ') then ' category ;
MPRINT(GENCAT):   if last.variable then put ' else . end as ' variable ;
MPRINT(GENCAT):   if eof then put "from rawdata" / ';' ;
MPRINT(GENCAT):   sep=',' ;
MPRINT(GENCAT):   run;

NOTE: The file CODE is:
      Filename=C:\Users\AppData\Local\Temp\1\SAS Temporary Files\_TD13724_AMRL20B7F00CGPP_\#LN00659,
      RECFM=V,LRECL=32767,File Size (bytes)=0,
      Last Modified=02Feb2018:12:36:39,
      Create Time=02Feb2018:12:36:39

NOTE: 12 records were written to the file CODE.
      The minimum record length was 1.
      The maximum record length was 28.
NOTE: There were 5 observations read from the data set WORK.METADATA.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.01 seconds

MPRINT(GENCAT):   proc sql;
MPRINT(GENCAT):   create table want as select case when (Var1<20 ) then 1 when (40>Var1>=20 ) then 2 when (Var1>=40 ) then 3 else .
end as Var1 ,case when (Var2<0.2 ) then 1 when (Var2>=0.2 ) then 2 else . end as Var2 from rawdata ;
NOTE: Table WORK.WANT created, with 3 rows and 2 columns.

MPRINT(GENCAT):   quit;