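For every row, I need the sum of usage (usage is a column in the dataset) over the preceding 3 hours, grouped by User and ID_option. Each row represents one record (taken at 3-minute intervals). For example (including the desired column sum_usage_3hr):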
User ID_option time usage sum_usage_3hr
1 a1 12OCT2017:11:20:32 3 10
1 a1 12OCT2017:10:23:24 7 14
1 b1 12OCT2017:09:34:55 12 12
2 b1 12OCT2017:08:55:06 4 6
1 a1 12OCT2017:07:59:53 7 7
2 b1 12OCT2017:06:59:12 2 2
I tried the hash-table code below:
data want;
if _n_=1 then do;
if 0 then set have(rename=(usage=_usage));
declare hash h(dataset:'have(rename=(usage=_usage))',hashexp:20);
h.definekey('user','id_option','time');
h.definedata('_usage');
h.definedone();
end;
set have;
sum_usage_3hr=0;
do i=time-3*3600 to time ;
if h.find(key:user,key:id_option,key:i)=0 then sum_usage_3hr+_usage;
end;
drop _usage i;
run;
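But I get an error: "Invalid DO loop control information, either the INITIAL or TO expression is missing or the BY expression is missing, zero, or invalid." If I add output; end; just above run; I get a different error: "No matching DO/SELECT statement." Does anyone know what is causing the problem? I also have a version that sorts the table first, and it gives the same error. Thanks.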
After implementing the answer:
User ID_option time usage sum_usage_3hr col_i_got
1 a1 12OCT2017:11:22:32 3 12 3
1 a1 12OCT2017:11:20:24 0 9 3
1 a1 12OCT2017:10:34:55 2 9 2
1 a1 12OCT2017:09:55:06 0 7 2
1 a1 12OCT2017:09:43:45 0 7 0
1 a1 12OCT2017:08:59:53 7 7 7
1 a1 12OCT2017:06:59:12 0 0 7
Answer 0 (score: 1)
Try this:
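On the error itself: SAS raises "Invalid DO loop control information" at run time when a DO loop's start, stop, or BY value is missing, so a likely cause is that time is missing on some rows or is stored as a character string rather than a numeric datetime. I can't see your data, so treat that as an assumption. The sketch below is one way to check; have_num and time_char are made-up names. The second error is simpler: the DO/END pairs in your step are already balanced, so an extra end; has nothing to match.

```sas
/* Is time numeric (Type = Num) and does it carry a datetime format? */
proc contents data=have;
run;

/* How many rows have a missing time? (PROC MEANS stops with an error if time is character) */
proc means data=have n nmiss;
  var time;
run;

/* Assumption: time is character text such as 12OCT2017:11:20:32.      */
/* Convert it to a numeric datetime; have_num and time_char are        */
/* hypothetical names used only for this illustration.                 */
data have_num;
  set have(rename=(time=time_char));
  time = input(time_char, datetime18.);
  format time datetime18.;
  drop time_char;
run;
```

If time turns out to be numeric and never missing, the cause lies elsewhere and the conversion step is not needed.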
Problem 1:
Input:
data have;
input User ID_option $ time usage ;
informat time datetime18.;
format time datetime18.;
cards;
1 a1 12OCT2017:11:20:32 3
1 a1 12OCT2017:10:23:24 7
1 b1 12OCT2017:09:34:55 12
2 b1 12OCT2017:08:55:06 4
1 a1 12OCT2017:07:59:53 7
2 b1 12OCT2017:06:59:12 2
;
run;
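Code:
proc sort data=have out=have1;
  by user id_option time;
run;
data have2;
  set have1;
  by user id_option;
  format previous_time datetime18.;
  /* LAG must execute on every row, so call it unconditionally */
  previous_time = lag(time);
  previous_usage = lag(usage);
  /* do not carry a value across user/ID_option groups */
  if first.ID_option then previous_time=.;
  /* adds only the immediately preceding record, and only if it lies within 3 hours */
  if previous_time ~= . and intnx("hour",time,-3,"s") <= previous_time <= time then sum_usage_3hr=usage+previous_usage;
  else sum_usage_3hr = usage;
  drop previous_time previous_usage;
run;
/* restore the newest-first order of the original table */
proc sort data=have2 out=want;
  by descending time ;
run;
Output: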
User ID_option time usage sum_usage_3hr
1 a1 12Oct2017 11:20:32 3 10
1 a1 12Oct2017 10:23:24 7 14
1 b1 12Oct2017 9:34:55 12 12
2 b1 12Oct2017 8:55:06 4 6
1 a1 12Oct2017 7:59:53 7 7
2 b1 12Oct2017 6:59:12 2 2
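Note that the LAG step looks back exactly one record, so it reproduces the sample but would undercount whenever two or more earlier records fall inside the 3-hour window. A minimal sketch of a full look-back in a DATA step follows. It is a different technique (re-reading the sorted table with POINT=), not the code above, and have_sorted, want_window, and the underscore-prefixed variables are names I made up for illustration. It assumes time is a numeric datetime.

```sas
proc sort data=have out=have_sorted;
  by user id_option time;
run;

data want_window;
  set have_sorted;                      /* current row; _n_ is its row number */
  sum_usage_3hr = 0;
  /* walk backwards from the current row through earlier rows */
  do p = _n_ to 1 by -1;
    set have_sorted(keep=user id_option time usage
                    rename=(user=_u id_option=_o time=_t usage=_usage))
        point=p;
    /* stop at a different group or at a record older than 3 hours */
    if _u ne user or _o ne id_option or _t < time - 3*3600 then leave;
    sum_usage_3hr + _usage;
  end;
  drop _u _o _t _usage;
run;

/* newest first, as in the original table */
proc sort data=want_window;
  by descending time;
run;
```

On the Problem 1 sample this should give the same sum_usage_3hr values as the table above. The backward scan stops as soon as it leaves the window, so each row only touches the records it actually sums plus one.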
Problem 2:
Input:
data have;
input user1 ID_option $ time usage ;
informat time datetime18.;
format time datetime18.;
cards;
1 a1 12OCT2017:11:22:32 3
1 a1 12OCT2017:11:20:24 0
1 a1 12OCT2017:10:34:55 2
1 a1 12OCT2017:09:55:06 0
1 a1 12OCT2017:09:43:45 0
1 a1 12OCT2017:08:59:53 7
1 a1 12OCT2017:06:59:12 0
;
run;
Code:
proc sql;
create table want as
select user1,id_option,time,min(usage) as usage,sum(usage1) as sum_usage_3hr
from
(
/* pair each row with every row of the same group in the preceding 3 hours */
select a.*,b.time as time1 ,b.usage as usage1
from
have a
left join
have b
on a.user1 = b.user1 and a.id_option = b.id_option and b.time <= a.time
/* datetimes are stored in seconds; intck("hour",...) would count hour boundaries, not elapsed time */
where b.time >= a.time - 3*3600
)
group by 1,2,3
order by time desc;
quit;
Output:
user1 ID_option time usage sum_usage_3hr
1 a1 12Oct2017 11:22:32 3 12
1 a1 12Oct2017 11:20:24 0 9
1 a1 12Oct2017 10:34:55 2 9
1 a1 12Oct2017 9:55:06 0 7
1 a1 12Oct2017 9:43:45 0 7
1 a1 12Oct2017 8:59:53 7 7
1 a1 12Oct2017 6:59:12 0 0
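A note on measuring the window: INTCK with the "hour" interval counts the hour boundaries crossed between two datetimes, not the elapsed time, so two records almost four hours apart can still be only three boundaries apart. Plain datetime subtraction is the safer test. A small check, using timestamps I made up for the purpose:

```sas
/* Made-up timestamps 3 h 58 min apart, with only three hour boundaries between them */
data _null_;
  a = '12OCT2017:11:59:00'dt;
  b = '12OCT2017:08:01:00'dt;
  boundaries = intck('hour', a, b);    /* hour boundaries crossed going from a to b */
  elapsed_hr = (a - b) / 3600;         /* datetimes are stored in seconds */
  in_window  = (b >= a - 3*3600);      /* 1 only if b is within the 3 hours before a */
  put boundaries= elapsed_hr= in_window=;
run;
```

The boundary count should come out as -3 here, so a filter such as intck("hour", a, b) >= -3 would accept this pair even though b is outside a strict 3-hour window.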
Let me know if you have any questions.