Rolling count by interval in SAS EG

Time: 2017-03-28 18:31:02

Tags: loops sas sas-macro

I have a dataset in SAS:

OBS  CAR       DATE_TIME
1    HON   JAN-01-17  13:00
2    HON   JAN-01-17  13:04
3    HON   JAN-01-17  13:06 
4    HON   JAN-01-17  13:15
5    HON   JAN-01-17  13:20
6    HON   JAN-01-17  13:29
7    TOY   JAN-01-17  13:05
8    TOY   JAN-01-17  13:10
9    TOY   JAN-01-17  13:39

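The data represents event timestamps by car type. I am trying to count the total number of events in any 10-minute interval for a particular car. Currently, I do this by adding another column equal to the datetime plus 10 minutes and then joining the table to itself. Here is the code.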

PROC SQL;
  CREATE TABLE WANT AS
  SELECT A.OBS, A.CAR, A.DATE_TIME,
         A.DATE_TIME + (10*60) AS ENDTM,  /* 10 minutes, in seconds */
         COUNT(B.OBS) AS TOTAL
  FROM HAVE A LEFT JOIN HAVE B
    ON A.CAR = B.CAR
   AND B.DATE_TIME BETWEEN A.DATE_TIME AND A.DATE_TIME + (10*60)
  GROUP BY A.OBS, A.CAR, A.DATE_TIME;
QUIT;

This is the output I get:

OBS  CAR       DATE_TIME       TOT
1    HON   JAN-01-17  13:00     3
2    HON   JAN-01-17  13:04     2
3    HON   JAN-01-17  13:06     2
4    HON   JAN-01-17  13:15     2
5    HON   JAN-01-17  13:20     2
6    HON   JAN-01-17  13:29     1
7    TOY   JAN-01-17  13:05     2
8    TOY   JAN-01-17  13:10     1
9    TOY   JAN-01-17  13:39     1

Is there a more efficient way to do this using a DATA step?

Thanks
Jay

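2 answers:

Answer 0: (score: 1)

One data step option is to use a temporary array, store the datetimes you have seen in it, and then check which elements of the array still meet your criteria. I'm going in the opposite direction from what you show above (I'm counting the previous 10 minutes), but you could sort the data in reverse and go in the direction you need (changing the intck comparison accordingly).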

data have;
input @1 OBS 1. @6 CAR $3. @12 DATE_TIME anydtdtm15.;
format date_time datetime17.;
datalines;
1    HON   JAN-01-17 13:00
2    HON   JAN-01-17 13:04
3    HON   JAN-01-17 13:06
4    HON   JAN-01-17 13:15
5    HON   JAN-01-17 13:20
6    HON   JAN-01-17 13:29
7    TOY   JAN-01-17 13:05
8    TOY   JAN-01-17 13:10
9    TOY   JAN-01-17 13:39
;;;;
run;

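data want;
  set have;
  by car date_time;
  /* Retained buffer of recent timestamps; size it larger than the
     most events expected in any 10-minute window */
  array prev_times[20] _temporary_;
  tot = 1;  /* count the current event itself */
  /* Walk back from the end of the sorted array until the first empty slot */
  do _i = dim(prev_times) to 1 by -1 while (not missing(prev_times[_i]));
    if intck('minute', prev_times[_i], date_time) le 10 then do;
      tot = tot + 1;
    end;
    else do;
      call missing(prev_times[_i]);  /* older than 10 minutes: drop it */
    end;
  end;
  prev_times[_i] = date_time;    /* _i now points at an empty slot */
  call sortn(of prev_times[*]);  /* missing values sort first, live times stay at the end */
  output;
  if last.car then call missing(of prev_times[*]);  /* reset for the next car */
  drop _i;
run;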
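
The reverse-direction variant mentioned above (counting the next 10 minutes, as in the question) could look roughly like the following. This is a sketch rather than the answerer's code: the names have_desc, want_fwd, next_times and _slot are made up for illustration, the array size of 20 is an assumption about how many events can fall into one window, and it compares raw datetime differences in seconds (600 = 10 minutes) instead of using intck.

```sas
/* Process each car from latest to earliest event */
proc sort data=have out=have_desc;
  by car descending date_time;
run;

data want_fwd;
  set have_desc;
  by car descending date_time;
  /* Assumption: no more than 20 events are buffered at once */
  array next_times[20] _temporary_;
  if first.car then call missing(of next_times[*]);  /* new car: empty buffer */
  tot = 1;      /* count the current event itself */
  _slot = .;    /* first free position found during the scan */
  do _i = 1 to dim(next_times);
    if not missing(next_times[_i]) then do;
      /* datetime values are seconds, so 600 is 10 minutes */
      if next_times[_i] - date_time le 600 then tot = tot + 1;
      else call missing(next_times[_i]);  /* too far ahead: drop it */
    end;
    if missing(next_times[_i]) and missing(_slot) then _slot = _i;
  end;
  /* Store the current timestamp for the earlier rows still to come */
  if not missing(_slot) then next_times[_slot] = date_time;
  drop _i _slot;
run;

/* Restore chronological order */
proc sort data=want_fwd;
  by car date_time;
run;
```

Scanning the whole array on every row avoids the call sortn step at the cost of a few extra comparisons. If the buffer ever fills up, the current timestamp is silently not stored, so the array should be sized generously for real data.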

Answer 1: (score: 0)

Not a data step, but proc timeseries will do this for you. Just convert your date to a datetime and use minute10 as the interval.

data have;
    /* Explicit informats: a plain $ would truncate the 9-character date to 8 */
    input group $ date :$9. time :$5. tot;

    month = scan(date, 1, '-');
    day = scan(date, 2, '-');
    year = scan(date, 3, '-');

    datetime = input(cats(day, month, year, ':', time), datetime.);

    format datetime datetime.;

    datalines;
HON JAN-01-17 13:00 3
HON JAN-01-17 13:04 2
HON JAN-01-17 13:06 2
HON JAN-01-17 13:15 2
HON JAN-01-17 13:20 2
HON JAN-01-17 13:29 1
TOY JAN-01-17 13:05 2
TOY JAN-01-17 13:10 1
TOY JAN-01-17 13:39 1
;
run;

proc timeseries data=have out=want;
    by group;
    id datetime interval=minute10;
    var tot / accumulate=total;
run;
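
To see which 10-minute bin each event is assigned to, a small check along these lines may help. It is only a sketch: the dataset name bins and the variable bin_start are made up here, and it assumes the have dataset built in this answer, with its datetime variable.

```sas
data bins;
  set have;
  /* Start of the 10-minute interval that contains each event */
  bin_start = intnx('minute10', datetime, 0, 'b');
  format bin_start datetime.;
run;

proc print data=bins;
  var group datetime bin_start tot;
run;
```

The bins are aligned to fixed clock boundaries rather than to each event's own timestamp, so this groups events into consecutive 10-minute blocks instead of computing a rolling window per row.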