Question

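I have a dataset sorted by 6 variables.

I want to use first.variable (in my case, the sixth variable) to set initial values for new variables in the dataset (the 7th and 8th variables).

Example: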

if first.variable_name then do;
ratevalue = 999;
factor = 100.00;
end;

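first.variable is the 6th variable in the BY group.

The first column in the group has the date value '3-20-2017' hard-coded, so there is only one group for the first column, and it contains all 200K observations.

The problem is that when I run the code above, I expect ratevalue and factor to be assigned only to the observations where first.variable_name = '1'.

However, the values are assigned to all 200K observations, starting from first.variable.

If I use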

if last.variable_name then do;

ratevalue = 999;
factor = 100.00;
end;

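then it assigns the above values to all observations, starting from the last observation of the 6th variable in that group.

How does this work?

Thanks!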

Answer 1

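You should be able to get the desired output without any problem. Make sure the data is ordered before you group it: either sort it with PROC SORT or, if the data is already grouped but not in sorted order, add the NOTSORTED option to the BY statement.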
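As a minimal sketch of the PROC SORT route, assuming the dataset and variable names from this answer's example (have, Period through Sales5) and a hypothetical output name have_sorted:

```sas
/* Sort by the same variables, in the same order, as the later BY statement. */
/* have_sorted is a hypothetical name; OUT= keeps the original data intact. */
proc sort data=have out=have_sorted;
  by Period Region city Sales1 Sales2 Sales3 Sales4 Sales5;
run;
```

A DATA step that reads have_sorted with the same BY list would then not need the NOTSORTED option.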

See the example below, which has 9 columns and groups by the first 8:

data have;
input
  Period    Region $  city $ Sales1    Sales2    Sales3     Sales4     Sales5  total ;
  datalines;
   1         North     XY  1 2 2  2 2 30
   1         North     XY  1 2 2 2 2 40
   1         South     ZZ  1 1 1  2 2 100
   1         South     ZY  1 1 0 1 1 40
   ;
run;

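data want;
set have;
by Period Region city Sales1 Sales2 Sales3 Sales4 Sales5 notsorted;
/* Copy the automatic first./last. flags so they show up in the output */
f_s5 = first.Sales5;
l_s5 = last.Sales5;
if first.Sales5 = 1 then do;
  ratevalue = 999;
  factor = 100.00;
end;
run;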

Output:

 Period=1 Region=North city=XY Sales1=1 Sales2=2 Sales3=2 Sales4=2 Sales5=2 total=30 f_s5=1 l_s5=0 ratevalue=999 factor=100
 Period=1 Region=North city=XY Sales1=1 Sales2=2 Sales3=2 Sales4=2 Sales5=2 total=40 f_s5=0 l_s5=1 ratevalue=. factor=.
 Period=1 Region=South city=ZZ Sales1=1 Sales2=1 Sales3=1 Sales4=2 Sales5=2 total=100 f_s5=1 l_s5=1 ratevalue=999 factor=100
 Period=1 Region=South city=ZY Sales1=1 Sales2=1 Sales3=0 Sales4=1 Sales5=1 total=40 f_s5=1 l_s5=1 ratevalue=999 factor=100
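
The question does not show the full DATA step, so the reason the values spread to all 200K observations cannot be confirmed from what is posted. One possibility, offered only as an assumption, is that ratevalue and factor are retained across iterations (for example through a RETAIN statement elsewhere in the step), in which case a value set on the first row of a group carries forward to the rows that follow. A minimal sketch that guards against this by clearing both variables on every non-first row, reusing the have dataset from the example above:

```sas
data want;
  set have;
  by Period Region city Sales1 Sales2 Sales3 Sales4 Sales5 notsorted;
  if first.Sales5 then do;
    /* first row of each BY group gets the initial values */
    ratevalue = 999;
    factor = 100.00;
  end;
  else do;
    /* assumption: something retains these variables, so reset them explicitly */
    ratevalue = .;
    factor = .;
  end;
run;
```

If no RETAIN is involved, the else branch is harmless, because new variables created by assignment are reset to missing at the start of each iteration anyway.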

SAS: if first.variable then assign values
