I need to join two tables using the closest timestamp.
data a;
input id name $5. timea time8.;
format timea time5.;
cards;
1 John 9:17
1 John 10:25
2 Chris 9:17
3 Emily 14:25
;run;
data b;
input id name $5. timeb time8.;
format timeb time5.;
cards;
1 John 9:00
1 John 10:00
2 Chris 9:00
3 Emily 14:30
;run;
Table Want:
id name timea timeb
1 John 9:17 9:00
1 John 10:25 10:00
2 Chris 9:17 9:00
3 Emily 14:25 14:30
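My approach is to build a key = id || name in table b, sort by that key, and then create an interval for each timestamp in table b. With the code below, I lose the first John record.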
data time(rename=prev_TimeB = TimeB);
length start_time end_time 8;
retain start_time 0 prev_TimeB;
set B(keep=TimeB) end = last;
by key;
if not first.key then do;
end_time = TimeB - ((TimeB - prev_TimeB) / 2);
output;
prev_timeB = TimeB;
if last.key then do;
end_time = '23:59:59.999't;
output;
end;
format prev_timeB start_time end_time time12.3;
drop TimeB;
run;
Thanks for your time!
Answer 0: (score: 0)
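For each record in a, find the record in b whose difference equals the minimum absolute difference. This is easy to code in SAS because PROC SQL automatically remerges aggregate function values with the detail records.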
data a;
input id name :$5. timea :time8.;
format timea time5.;
cards;
1 John 9:17
1 John 10:25
2 Chris 9:17
3 Emily 14:25
4 Joe 11:21
;
data b;
input id name :$5. timeb :time8.;
format timeb time5.;
cards;
1 John 9:00
1 John 10:00
2 Chris 9:00
3 Emily 14:30
;
proc sql ;
create table C as
select a.*
, timeb
, timea-timeb as seconds
, abs(calculated seconds) as distance
from a
left join b
on a.id = b.id and a.name = b.name
group by a.id,a.name,a.timea
having min(calculated distance) = calculated distance
;
quit;
Result:
id name timea timeb seconds distance
1 John 9:17 9:00 1020 1020
1 John 10:25 10:00 1500 1500
2 Chris 9:17 9:00 1020 1020
3 Emily 14:25 14:30 -300 300
4 Joe 11:21 . . .
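To match the layout of the desired table, the helper columns can be dropped in a second step. This is a minimal sketch that assumes table C was created by the query above; the output name want is only illustrative.

```sas
/* Keep only the columns shown in the desired table.        */
/* Assumes C exists as created by the PROC SQL step above.  */
proc sql;
  create table want as
  select id, name, timea, timeb
  from C
  order by id, timea;
quit;
```

Note that if two timeb values are equally distant from a timea, the HAVING clause keeps both rows, so a tie-breaking rule would still be needed in that case.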
Answer 1: (score: -1)
If datasets A and B are already sorted, you can add a temporary variable pos = _n_ to both tables:
Data a;
set a;
pos=_n_;
run;
Data b;
set b;
pos=_n_;
run;
You will then have the following tables:
Table A:
id name timea pos
1 John 9:17 1
1 John 10:25 2
2 Chris 9:17 3
3 Emily 14:25 4
Table B:
id name timeb pos
1 John 9:00 1
1 John 10:00 2
2 Chris 9:00 3
3 Emily 14:30 4
Then you can use a join in a proc sql step:
proc sql;
create table result as
select *
from a t1
left join b t2
on t1.pos=t2.pos;
quit;
If the datasets are not sorted, sort them into the correct order first.
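The sorting step could look like the sketch below, run before pos is added. It assumes the time variable in B is named timeb, as in Answer 0, and that sorting by id, name, and time is the intended "correct order"; adjust the BY variables if your data differ.

```sas
/* Sort both tables the same way so that row positions line up. */
/* Assumption: B holds its time in a variable named timeb.      */
proc sort data=a;
  by id name timea;
run;

proc sort data=b;
  by id name timeb;
run;
```

This positional match only pairs rows correctly when both tables have the same number of rows per id.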