我有以下数据集:
Date jobboardid Sales
Jan05 3 256
Jan05 6 70
Jan05 54 90
Feb05 32 456
Feb05 11 89
Feb05 16 876
March05
April05
.
.
.
Jan06 6 678
Jan06 54 87
Jan06 13 56
Feb06 McDonald 67
Feb06 11 281
Feb06 16 876
March06
April06
.
.
.
Jan07 6 567
Jan07 54 76
Jan07 34 87
Feb07 10 678
Feb07 11 765
Feb07 16 67
March07
April06
我试图计算当jobboardid列相隔12个月时具有相同值时Sales列的12个月增长率。我有以下代码:
data Want;
set Have;
by Date jobboardid;
format From Till monyy7.;
from = lag12(Date);
oldsales = lag12(sales);
if lag12 (jobboardid) EQ jobboardid
and INTCK('month', from, Date) EQ 12 then do;
till = Date;
rate = (sales - oldsales) / oldsales;
output;
end;
run;
但是我一直收到以下错误消息: 注意:由于对缺失值执行操作而创建缺失值。
但是当我检查我的数据集时,没有任何缺失值。问题是什么?
注意:我的日期列在monyy7中。格式。 jobboardid是数值,Sales也是。
答案 0 :(得分:1)
INTCK()函数抛出NOTE。当你说from=lag12(date)
时,前12条记录的遗漏值将为。{1}}。然后INTCK('month', from, Date)
将抛出NOTE。即使INTCK没有在赋值语句中使用,它仍然会抛出NOTE,因为它的一个参数缺少值。以下是一个例子。日志报告缺失值创建了12次,因为我使用了lag12。
77 data have;
78 do Date=1 to 20;
79 output;
80 end;
81 run;
NOTE: The data set WORK.HAVE has 20 observations and 1 variables.
82 data want;
83 set have;
84 from=lag12(Date);
85 if intck('month',from,today())=. then put 'Missing: ' (_n_ Date)(=);
86 else put 'Not Missing: ' (_n_ Date)(=);
87 run;
Missing: _N_=1 Date=1
Missing: _N_=2 Date=2
Missing: _N_=3 Date=3
Missing: _N_=4 Date=4
Missing: _N_=5 Date=5
Missing: _N_=6 Date=6
Missing: _N_=7 Date=7
Missing: _N_=8 Date=8
Missing: _N_=9 Date=9
Missing: _N_=10 Date=10
Missing: _N_=11 Date=11
Missing: _N_=12 Date=12
Not Missing: _N_=13 Date=13
Not Missing: _N_=14 Date=14
Not Missing: _N_=15 Date=15
Not Missing: _N_=16 Date=16
Not Missing: _N_=17 Date=17
Not Missing: _N_=18 Date=18
Not Missing: _N_=19 Date=19
Not Missing: _N_=20 Date=20
NOTE: Missing values were generated as a result of performing an operation on missing values.
Each place is given by: (Number of times) at (Line):(Column).
12 at 85:6
NOTE: There were 20 observations read from the data set WORK.HAVE.
NOTE: The data set WORK.WANT has 20 observations and 2 variables.
避免此问题的一种方法是添加另一个类似(未经测试)的阻止程序:
if lag12 (jobboardid) EQ jobboardid and _n_> 12 then do;
if INTCK('month', from, Date) EQ 12 then do;
till = Date;
rate = (sales - oldsales) / oldsales;
output;
end;
end;