I have a dataset in SAS:
OBS CAR DATE_TIME
1 HON JAN-01-17 13:00
2 HON JAN-01-17 13:04
3 HON JAN-01-17 13:06
4 HON JAN-01-17 13:15
5 HON JAN-01-17 13:20
6 HON JAN-01-17 13:29
7 TOY JAN-01-17 13:05
8 TOY JAN-01-17 13:10
9 TOY JAN-01-17 13:39
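The data represent event timestamps by car type. I am trying to count the total number of events that fall in the 10-minute interval starting at each event, for a given car. Currently I do this by adding another column, the datetime plus 10 minutes, and then joining the table to itself. Here is the code.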
PROC SQL; CREATE TABLE WANT AS
SELECT A.OBS,A.CAR,A.DATE_TIME,A.DATE_TIME+(10*60) AS ENDTM,
COUNT(B.OBS) AS TOTAL
FROM HAVE A LEFT JOIN HAVE B ON A.CAR=B.CAR AND B.DATE_TIME BETWEEN A.DATE_TIME AND A.DATE_TIME+(10*60)
GROUP BY A.OBS,A.CAR,A.DATE_TIME,CALCULATED ENDTM;QUIT;
This is the output I get:
OBS CAR DATE_TIME TOT
1 HON JAN-01-17 13:00 3
2 HON JAN-01-17 13:04 2
3 HON JAN-01-17 13:06 2
4 HON JAN-01-17 13:15 2
5 HON JAN-01-17 13:20 2
6 HON JAN-01-17 13:29 1
7 TOY JAN-01-17 13:05 2
8 TOY JAN-01-17 13:10 1
9 TOY JAN-01-17 13:39 1
Is there a more efficient way to do this using a DATA step?
Thanks
Jay
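Answer 0 (score: 1)
One DATA step option is to use a temporary array: store each timestamp as you encounter it, then check which elements of the array still fall inside the window. I am doing this in the opposite direction from what you show above (I count the prior 10 minutes), but you could sort the data in reverse order and go in the direction you need (changing the intck comparison accordingly).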
data have;
/* Column pointers match the single-space layout of the datalines below. */
input @1 OBS 1. @3 CAR $3. @7 DATE_TIME anydtdtm15.;
format date_time datetime17.;
datalines;
1 HON JAN-01-17 13:00
2 HON JAN-01-17 13:04
3 HON JAN-01-17 13:06
4 HON JAN-01-17 13:15
5 HON JAN-01-17 13:20
6 HON JAN-01-17 13:29
7 TOY JAN-01-17 13:05
8 TOY JAN-01-17 13:10
9 TOY JAN-01-17 13:39
;;;;
run;
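data want;
set have;
by car date_time;
/* Earlier timestamps for the current car; _temporary_ arrays are retained across rows. */
/* Assumes fewer than 20 events ever fall inside one window. */
array prev_times[20] _temporary_;
tot = 1; /* count the current event itself */
/* CALL SORTN puts missing values first, so stored times sit at the high end of the array. */
do _i = dim(prev_times) to 1 by -1 while (not missing(prev_times[_i]));
if intck('minute',prev_times[_i], date_time) le 10 then do;
tot = tot + 1;
end;
else do;
call missing(prev_times[_i]); /* more than 10 minutes old: discard */
end;
end;
prev_times[_i] = date_time; /* _i now indexes the first free slot */
call sortn(of prev_times[*]);
output;
if last.car then call missing(of prev_times[*]); /* reset before the next car */
run;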
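The reverse-direction variant mentioned above could look roughly like the following. This is only a sketch: the names have_desc, want_fwd and next_times are made up for illustration, and it assumes HAVE has been read in with a valid numeric DATE_TIME and that fewer than 20 events fall inside any one window. The data are sorted descending within each car so that the array holds later events, and the intck arguments are swapped so the difference is measured forward from the current row.

```sas
/* Sketch only: forward-looking 10-minute counts. */
/* have_desc, want_fwd and next_times are illustrative names. */
proc sort data=have out=have_desc;
  by car descending date_time;
run;

data want_fwd;
  set have_desc;
  by car descending date_time;
  /* Later timestamps for the current car (retained across rows). */
  array next_times[20] _temporary_;
  tot = 1; /* the current event itself */
  do _i = dim(next_times) to 1 by -1 while (not missing(next_times[_i]));
    /* Arguments swapped: minutes from this event to the later one. */
    if intck('minute', date_time, next_times[_i]) le 10 then tot = tot + 1;
    else call missing(next_times[_i]); /* more than 10 minutes ahead */
  end;
  next_times[_i] = date_time; /* first free slot */
  call sortn(of next_times[*]);
  if last.car then call missing(of next_times[*]);
  drop _i;
run;

/* Restore the original row order. */
proc sort data=want_fwd;
  by obs;
run;
```

On the sample data this should give the same counts as the SQL join in the question. Note that intck('minute', ...) counts minute boundaries crossed rather than elapsed seconds, which makes no difference here because every timestamp falls on a whole minute.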
Answer 1 (score: 0)
Not a DATA step, but proc timeseries will do this for you. Just convert your dates to a datetime and use an interval of minute10.
data have;
/* date needs length 9: a plain $ in list input truncates "JAN-01-17" to 8 characters */
input group$ date :$9. time$ tot;
month = scan(date, 1, '-');
day = scan(date, 2, '-');
year = scan(date, 3, '-');
datetime = input(cats(day, month, year, ':', time), datetime.);
format datetime datetime.;
datalines;
HON JAN-01-17 13:00 3
HON JAN-01-17 13:04 2
HON JAN-01-17 13:06 2
HON JAN-01-17 13:15 2
HON JAN-01-17 13:20 2
HON JAN-01-17 13:29 1
TOY JAN-01-17 13:05 2
TOY JAN-01-17 13:10 1
TOY JAN-01-17 13:39 1
;
run;
proc timeseries data=have out=want;
by group;
id datetime interval=minute10;
var tot / accumulate=total;
run;
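To see which bucket each event lands in, it may help to compute the bin start explicitly. The sketch below assumes HAVE from this answer holds a valid numeric datetime; binned, bin_start and bin_counts are illustrative names. It uses intnx with the MINUTE10 interval, which yields clock-aligned bins (13:00, 13:10, 13:20, ...) rather than a window that starts at each event.

```sas
/* Sketch only: assign each event to its fixed 10-minute bin. */
/* binned, bin_start and bin_counts are illustrative names. */
data binned;
  set have;
  /* 'b' aligns to the beginning of the MINUTE10 interval containing datetime */
  bin_start = intnx('minute10', datetime, 0, 'b');
  format bin_start datetime.;
run;

/* Count the events per group and bin. */
proc freq data=binned noprint;
  tables group*bin_start / out=bin_counts(drop=percent);
run;
```

The COUNT column in bin_counts is then the number of events per group and bin, which is a different quantity from the per-event rolling count asked for in the question.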