Question

Case 1

Suppose the data is sorted by year and month (there are always 3 observations in the data).

Year Month Index
2014 11    1.1
2014 12    1.5
2015 1     1.2

I need to copy the Index of the last month into a new observation for the following month:

Year Month Index
2014 11    1.1
2014 12    1.5
2015 1     1.2
2015 2     1.2

Case 2

Year has been removed from the data, so we only have Month and Index.

Month Index
1     1.2
11    1.1
12    1.5

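Data is always collected from 3 consecutive months, so here month 1 is the last month (it follows 11 and 12).

The desired output is still: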

Month Index
1     1.2
2     1.2
11    1.1
12    1.5

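I solved it by creating another data set containing only Month (1, 2, ..., 12) and then right joining the original data set to it twice. But I think there must be a more elegant way to solve this.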

Answer 1

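Case 1 can be a straightforward data step. Adding end=eof to the set statement creates a variable eof that is set to 1 when the data step reads the last row of the data set. The output statement writes one row on every iteration. When eof = 1, the do block runs, advancing the month by 1 and writing one more row, which keeps the Index of the last row.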

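data want;
  set have end=eof;                     /* eof = 1 on the last row of have */
  output;                               /* write every existing row */
  if eof then do;
    month = mod(month, 12) + 1;         /* next month; 12 wraps to 1 */
    if month = 1 then year = year + 1;  /* roll the year over after December */
    output;                             /* extra row keeps the last Index */
  end;
run;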
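
To try the step above, the sample data from Case 1 could be loaded as sketched below. The table names have and want come from the answer; the datalines setup and the proc print check are my own additions for illustration, so treat them as assumptions about how the data is stored.

```sas
/* Assumed setup: the three sample rows from Case 1.
   Run this before the data step that creates want. */
data have;
  input Year Month Index;
  datalines;
2014 11 1.1
2014 12 1.5
2015 1 1.2
;
run;

/* After running the data step, list the result.
   For this sample, want should hold the three original rows
   plus one new row: Year=2015, Month=2, Index=1.2. */
proc print data=want noobs;
run;
```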

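For Case 2, I would switch to an SQL solution. Join the table to itself on month, with the month in the second copy incremented by 1. Use the coalesce function to keep the values from the first copy when they exist; otherwise, use the values from the second copy. Because data that spans December to January produces 5 rows, the outobs= option on proc sql limits the output to four rows, which drops the unwanted extra row.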

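proc sql outobs=4;                      /* keep only the first four rows */
create table want as
select
  coalesce(t1.month, mod(t2.month, 12) + 1) as month,  /* existing month, else next month (12 wraps to 1) */
  coalesce(t1.index, t2.index) as index                /* existing Index, else the previous month's Index */
from
  have t1
  full outer join have t2
  on t1.month = t2.month + 1
order by
  coalesce(t1.month, t2.month + 1)      /* unwrapped value, so December's successor (13) sorts last */
;
quit;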
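
As a quick check of the query above, the Case 2 sample could be set up as follows. Again, have and want are the names used in the answer, while the datalines step and the proc print are assumed scaffolding that is not part of the original post.

```sas
/* Assumed setup: the three sample rows from Case 2 (no Year column).
   Run this before the proc sql step that creates want. */
data have;
  input Month Index;
  datalines;
1 1.2
11 1.1
12 1.5
;
run;

/* After running the proc sql step, list the result.
   For this sample the four rows kept should be
   Month 1 (1.2), 2 (1.2), 11 (1.1) and 12 (1.5);
   the fifth row, a second January derived from December,
   sorts last and is cut off by outobs=4. */
proc print data=want noobs;
run;
```

Note that SAS writes a warning to the log when outobs= truncates a result, so a warning on data that crosses the year boundary is expected here.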
