Question

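Suppose I read this data into SAS:

I want to list each unique name and the number of months in which it appears in the data above, producing a dataset like this:

I have looked at PROC FREQ, but I think I need to do this in a DATA step, because I want to be able to create other variables in the new dataset and otherwise manipulate the new data.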

Answer 1

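While you can do this in a DATA step, you normally wouldn't; you would use PROC FREQ or something similar. Nearly every PROC can give you an output dataset (rather than just printing to the screen).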

PROC FREQ data=sashelp.class;
  tables age/out=age_counts noprint;
run;

You can then use this output dataset (age_counts) as the SET input to another DATA step to perform further calculations.
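As a minimal sketch of that follow-up step, assuming the age_counts dataset created by the PROC FREQ call above (its OUT= dataset holds the table variable plus COUNT and PERCENT), a second DATA step might derive a new variable. The names age_summary and is_common, and the threshold of 4, are made up for illustration.

```sas
data age_summary;
  set age_counts;              /* PROC FREQ output: AGE, COUNT, PERCENT */
  is_common = (count >= 4);    /* hypothetical flag: 1 if the age occurs 4 or more times */
run;
```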

Answer 2

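You can also use PROC SQL to group by a variable and count the number of rows in each group. It may be faster than PROC FREQ, depending on the size of your data.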

proc sql noprint;
    create table counts as
    select AGE, count(*) as AGE_CT from sashelp.class 
    group by AGE;
quit;

Answer 3

If you do want to do this in a DATA step, you can use a hash object to hold the counts:

data have;
do i=1 to 100;
    do V = 'a', 'b', 'c';
        output;
    end;
end;
run;

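data _null_;
set have end=last;
if _n_ = 1 then do;
    /* Build the hash once: V is the key, V_CNT is the running count */
    declare hash cnt();
    rc = cnt.definekey('v');
    rc = cnt.definedata('v','v_cnt');
    rc = cnt.definedone();
    call missing(v_cnt);
end;

/* find() returns 0 when the key exists and loads its V_CNT */
rc = cnt.find();
if rc then do;
    /* First time this value is seen */
    v_cnt = 1;
    cnt.add();
end;
else do;
    /* Seen before: increment and store the new count */
    v_cnt = v_cnt + 1;
    cnt.replace();
end;

/* After the last row, write the hash contents to WANT */
if last then
    rc = cnt.output(dataset: "want");
run;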

This is very efficient because it makes a single pass over the data. The WANT dataset contains the key values and their counts.

Answer 4

Data step:

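proc sort data=have;
  by name month;
run;

/* Assumes MONTH holds a SAS date value, as the PROC SQL version does */
data want;
  set have;
  by name month;
  m = month(lag(month));                       /* calendar month of the previous row */
  if first.name then months = 1;               /* start counting for each new name */
  else if month(month) ^= m then months + 1;   /* count each change of month */
  if last.name then output;                    /* one row per name */
  keep name months;
run;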

PROC SQL:

proc sql;
   select distinct name,count(distinct(month(month))) as months from have group by name;
quit;

How to extract a variable's unique values and their counts in SAS
