I have two datasets and only want them to merge if they meet a certain condition within both sets.
The code below is not correct.
data cohrt2_base_pre_&m2. cohrt2_base_a4lmig_&m2.;
length Base_indicator_&m2. $20.;
merge if pre_cb_&m3. = pre_cb_&m4. then do;
custon.cohrt_base_pre_&m1. (in=a)
VALIAT_B.STATPST_M1_&m2. (in=b drop=base
(rename=(priceplan=Pre_priceplan_&m4.)));
by kit_sim msisdn
end;
Answer 0 (score: 1)
PROC SQL can do this:
proc sql ;
create table want (drop=base) as
select a.*, b.*, b.priceplan as pre_priceplan_&m4
from custon.cohrt_base_pre_&m1 a
inner join VALIAT_B.STATPST_M1_&m2 b
on a.kit_sim = b.kit_sim
and a.msisdn = b.msisdn
and pre_cb_&m3 = pre_cb_&m4
;
quit ;
Answer 1 (score: 0)
So pre_cb_&m3 and pre_cb_&m4 exist in both datasets, and you want to filter on them? If so, you can add a where= dataset option, something like this (untested):
data cohrt2_base_pre_&m2.
cohrt2_base_a4lmig_&m2.
;
length Base_indicator_&m2. $20.;
merge
custon.cohrt_base_pre_&m1. (where=(pre_cb_&m3. = pre_cb_&m4.)
in=a
)
VALIAT_B.STATPST_M1_&m2. (where=(pre_cb_&m3. = pre_cb_&m4.)
drop=base
rename=(priceplan=Pre_priceplan_&m4.)
in=b
);
by kit_sim msisdn;
run;
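If the variables exist in both datasets, they will collide in the merge, which is generally a bad idea. Another idea is to create a view of each dataset that does the subsetting, and then merge the two views. That should let you avoid the collision.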
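A minimal sketch of the view approach, also untested. It assumes that both pre_cb_ variables exist in both datasets, that both inputs are already sorted by kit_sim and msisdn, and that only matching rows are wanted. The names pre_v, stat_v and want are hypothetical and used only for illustration.

```sas
/* View over the first dataset: keeps only rows that meet the condition. */
data pre_v / view=pre_v;
  set custon.cohrt_base_pre_&m1.;
  where pre_cb_&m3. = pre_cb_&m4.;
run;

/* View over the second dataset: same subsetting, then the comparison
   variables are dropped from the view's output so they cannot collide
   with the copies coming from the first dataset. */
data stat_v / view=stat_v;
  set VALIAT_B.STATPST_M1_&m2. (drop=base rename=(priceplan=Pre_priceplan_&m4.));
  where pre_cb_&m3. = pre_cb_&m4.;
  drop pre_cb_&m3. pre_cb_&m4.;
run;

/* Merge the two views; keeping only keys present in both is an assumption. */
data want;
  merge pre_v (in=a) stat_v (in=b);
  by kit_sim msisdn;
  if a and b;
run;
```

Because the WHERE statement is evaluated against the input variables and DROP only affects what the view outputs, the second view can filter on the pre_cb_ variables and still leave them out of the merge.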