I have three datasets, A, B, and C, which contain the following variables:
A:
period region city
B:
period city Sales
C:
period region Sales
My goal is to left join B and C onto A to get the sales information based on geography. I tried doing it sequentially:
/* Left joining B to A based on period and city */
proc sql;
Create table merge1 as
select l.* , r.* from
A as l left join B as r
on l.period = r.period and l.city=r.city;
quit;
/* Renaming Sales variable*/
data merged2;
set merge1;
rename Sales= s1;
run;
/*Doing another left join again, this time using C*/
proc sql;
create table merge3 as
select l.*,r.* from
merged2 as l left join C as r /* use merged2, not A, so that s1 is carried forward */
on l.period= r.period and l.region=r.region;
quit;
/*Replacing some of the values*/
data merge4;
set merge3;
Sales1= IFN(s1=., Sales, s1);
drop s1 Sales;
run;
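My question is whether there is a better or more efficient way to do this, particularly for multiple left joins. The process becomes very tedious as the number of datasets and variables to match grows. Thanks!
Answer 0 (score: 0)
You can do this in a single PROC SQL step. Because you have multiple tables, you have to join them one at a time, but the joins can be chained in the same query.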
proc sql;
Create table merge1 as select
A.* ,
B.sales as s1,
C.sales as s2,
coalesce(B.sales, C.sales) as Sales /* takes the first non-missing value */
from A
left join B on (A.period = B.period and A.city = B.city)
left join C on (A.period = C.period and A.region = C.region);
quit;
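To see how the chained joins and coalesce behave, here is a minimal sketch on made-up data. The rows in A, B, and C and the output table name merged are assumptions invented for illustration; they are not from the original question.

```sas
/* Hypothetical sample data, invented only to illustrate the join */
data A;
  input period region $ city $;
  datalines;
1 East Boston
1 West Denver
;
run;

data B; /* city-level sales: only Boston has a row */
  input period city $ Sales;
  datalines;
1 Boston 100
;
run;

data C; /* region-level sales: used as the fallback */
  input period region $ Sales;
  datalines;
1 East 500
1 West 300
;
run;

proc sql;
  create table merged as
  select A.*,
         /* city-level value if present, otherwise the region-level value */
         coalesce(B.Sales, C.Sales) as Sales
  from A
  left join B on (A.period = B.period and A.city = B.city)
  left join C on (A.period = C.period and A.region = C.region);
quit;
```

With this data, the Boston row should get Sales = 100 from B, and the Denver row, which has no match in B, should fall back to Sales = 300 from C. Note that this assumes B has at most one row per period and city, and C at most one row per period and region; if a key is duplicated, the left join will repeat the matching rows of A.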