Question

I know this is probably a fairly simple question for SAS users who are familiar with array programming, but I am new to this.

My dataset looks like this:

Data have;          
Input group $ size price;
Datalines;
A 24 5
A 28 10
A 30 14
A 32 16
B 26 10
B 28 12
B 32 13
C 10 100
C 11 130
C 12 140
;
Run;

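What I want to do is determine the rate of change in price between the first two items in each family (group), and then apply that rate to every other member of the family.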

So I would end up with something that looks like this (group A only...):

Data want;         
Input group $ size price newprice;
Datalines;
A 24 5 5 
A 28 10 10
A 30 14 12.5
A 32 16 15
;
Run;

Answer 1

The techniques you need to learn are retain or dif/lag. Either approach will work here.

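The following illustrates one way to solve this, but you will need extra work to handle things like size not changing between the first two rows (which means a 0 denominator) and other potential anomalies.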

Basically, we use retain to carry values forward from one record to the next, and then use them in the calculation.

data want;
  set have;
  by group;
  retain lastprice rateprice lastsize;
  if first.group then do;
    counter=0;
    call missing(of lastprice rateprice lastsize); *clear these out;
  end;
  counter+1;                                       *Increment the counter;
  if counter=2 then do;
    rateprice=(price-lastprice)/(size-lastsize);   *Calculate the rate over 2;
  end;
  if counter le 2 then newprice=price;             *For the first two just move price into newprice;
  else if counter>2 then newprice=lastprice+(size-lastsize)*rateprice; *Else set it to the change;
  output;
  lastprice=newprice;        *save the price and size in the retained vars;
  lastsize=size;
run;
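
As a check on the step above, here is the arithmetic for group A worked by hand. The notation is introduced here for illustration only and is not part of the original answer: $s_i$ and $p_i$ are the size and price on the $i$-th row of the group, $r$ is rateprice, and $\hat{p}_i$ is newprice.

```latex
\begin{align*}
r &= \frac{p_2 - p_1}{s_2 - s_1} = \frac{10 - 5}{28 - 24} = 1.25 \\
\hat{p}_3 &= \hat{p}_2 + (s_3 - s_2)\, r = 10 + (30 - 28)(1.25) = 12.5 \\
\hat{p}_4 &= \hat{p}_3 + (s_4 - s_3)\, r = 12.5 + (32 - 30)(1.25) = 15
\end{align*}
```

These agree with the newprice values 12.5 and 15 asked for in the question. Each row builds on the previous row's newprice rather than its original price, which is why lastprice is assigned from newprice after the output statement.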

Answer 2

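Here is a different approach. It is obviously longer than Joe's, but it generalizes to other similar situations where the calculation is different or depends on more values.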

Add a sequence number to your dataset:

data have2;
  set have;
  by group;
  if first.group then seq = 0;
  seq + 1;
run;

Use proc reg to calculate the intercept and slope from the first two rows of each group, and use outest to output the estimates:

proc reg data=have2 outest=est;
  by group;
  model price = size;
  where seq le 2;
run;

Join the original table to the parameter estimates and calculate the predicted values:

proc sql;
create table want as
select
  h.*,
  e.intercept + h.size * e.size as newprice
from
  have h
  left join est e
  on h.group = e.group
order by
  group,
  size
;
quit;
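
Because only two rows per group enter the regression, the fitted line passes exactly through both points, so the estimates reduce to the familiar two-point formulas. A sketch of this for group A, using notation introduced here for illustration: $b$ is the slope (the column named size in est), $a$ is the intercept, and $\hat{p}(s)$ is the predicted newprice at size $s$.

```latex
\begin{align*}
b &= \frac{p_2 - p_1}{s_2 - s_1} = \frac{10 - 5}{28 - 24} = 1.25 \\
a &= p_1 - b\, s_1 = 5 - (1.25)(24) = -25 \\
\hat{p}(30) &= a + 30\, b = -25 + 37.5 = 12.5 \\
\hat{p}(32) &= a + 32\, b = -25 + 40 = 15
\end{align*}
```

The first two rows reproduce their original prices ($-25 + 24 \times 1.25 = 5$ and $-25 + 28 \times 1.25 = 10$), and the remaining rows match the desired output.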

Determining the rate of change for different groups
