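I want to combine the following two tables. The final table should look like Table B, with only the anchor and readmit names changed, based on anchor_num and readm_num.
If b.anchor_num = a.sprv_num then b.anchor_provider = a.sprv_full_name; if b.readm_num = a.sprv_num then b.readmit_provider = a.sprv_full_name;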
Table A
SPRV_NUM SPRV_FULL_NAME
70010Q40 NFI MASSACHUSETTSINC
700122330 NORTHAMPTON VA MEDICAL CENTER
700122223 RADIUS SPECIALTY HOSPITAL LLCDBA
700122331 SHAUGHNESSY KAPLANREHAB HOSPINC
700122330 SPAULDING HOSPITALCAMBRIDGE INC
70010Q402 SPRING HILL RECOVERY CENTER INC
700122222 SPRINGFIELD PARK VIEW HOSPITALDBA
700122222 SPRINGFIELD PARK VIEW HOSPITALDBA
70010Q057 ST ANNES HOME INC
70010Q007 STAR OF RHODE ISLAND
Table B
anchor_num  anchor                        readm_num  readmit
700122224   Harrington memorial hospital  700122229  first psychiatricap inc
700122224   Harrington memorial hospital  700122224  Harrington memorial hospital
700122330   NORTHAMPTON VA MEDICAL        700122223  RADIUS SPECIALTY HOSPITAL
700122222   SPRINGFIELD PARK VIEW         700122222  SPRINGFIELD PARK VIEW
700122226   HENRY HEYWOOD HOSPITAL        70010Q402  SPRING HILL RECOVERY INC
70010Q057   ST ANNES HOME INC             70010Q057  ST ANNES HOME INC
Thanks, Jane
Answer 0 (score: 0)
First, use proc sort to remove the duplicate sprv_num values from table a:
proc sort data=tablea out=tablea2 nodupkey;
by sprv_num;
run;
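(The nodupkey option removes observations that have duplicate values of the BY variables you sort on, leaving one observation per sprv_num.)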
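One thing worth checking before relying on nodupkey: in the sample data, sprv_num 700122330 is listed under two different names, so the dedup step keeps only one of them. The query below is a minimal sketch for listing such keys, assuming the input table is named tablea as in the step above. The column alias n_names is made up for this illustration.

```sas
/* List every sprv_num that appears with more than one distinct name. */
/* n_names is a hypothetical alias used only for this check.          */
proc sql;
    select sprv_num, count(distinct sprv_full_name) as n_names
    from tablea
    group by sprv_num
    having count(distinct sprv_full_name) > 1;
quit;
```

If this returns any rows, you may want to decide which name is correct for those numbers before running the sort.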
Then join the tables:
proc sql;
create table final as
select b.anchor_num, coalesce(x.sprv_full_name, b.anchor) as anchor_provider,
b.readm_num, coalesce(y.sprv_full_name, b.readmit) as readmit_provider
from tableb as b
left join tablea2 as x on b.anchor_num = x.sprv_num
left join tablea2 as y on b.readm_num = y.sprv_num
order by anchor_num;
quit;
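The coalesce function returns the first non-missing value. This means that if table a has no entry for a given anchor_num or readm_num, the name is taken from table b. I join table a twice because the name is needed in two different places.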
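To try the two steps end to end, here is a minimal sketch that builds small versions of both tables and runs the sort and the join. The rows are a handful adapted from the question. The pipe delimiter and the variable lengths ($9 and $40) are assumptions made for this illustration, not part of the original data.

```sas
/* Hypothetical sample of table a; '|' is used as the delimiter so   */
/* that names containing blanks are read as a single value.          */
data tablea;
    infile datalines dlm='|' truncover;
    length sprv_num $9 sprv_full_name $40;
    input sprv_num $ sprv_full_name $;
    datalines;
700122330|NORTHAMPTON VA MEDICAL CENTER
700122223|RADIUS SPECIALTY HOSPITAL LLCDBA
700122222|SPRINGFIELD PARK VIEW HOSPITALDBA
700122222|SPRINGFIELD PARK VIEW HOSPITALDBA
70010Q057|ST ANNES HOME INC
;
run;

/* Hypothetical sample of table b. */
data tableb;
    infile datalines dlm='|' truncover;
    length anchor_num $9 anchor $40 readm_num $9 readmit $40;
    input anchor_num $ anchor $ readm_num $ readmit $;
    datalines;
700122224|Harrington memorial hospital|700122229|first psychiatricap inc
700122330|NORTHAMPTON VA MEDICAL|700122223|RADIUS SPECIALTY HOSPITAL
700122222|SPRINGFIELD PARK VIEW|700122222|SPRINGFIELD PARK VIEW
700122226|HENRY HEYWOOD HOSPITAL|70010Q402|SPRING HILL RECOVERY INC
;
run;

/* Step 1: keep one row per sprv_num. */
proc sort data=tablea out=tablea2 nodupkey;
    by sprv_num;
run;

/* Step 2: look up both names, falling back to the table b name. */
proc sql;
    create table final as
    select b.anchor_num, coalesce(x.sprv_full_name, b.anchor) as anchor_provider,
           b.readm_num, coalesce(y.sprv_full_name, b.readmit) as readmit_provider
    from tableb as b
    left join tablea2 as x on b.anchor_num = x.sprv_num
    left join tablea2 as y on b.readm_num = y.sprv_num
    order by anchor_num;
quit;

proc print data=final noobs;
run;
```

With these rows, the 700122224 and 700122226 anchors should keep their table b names, because the sample table a has no entry for them, while 700122330 and 700122222 should pick up the full names from table a.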