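I am trying to collapse my data using proc sql. However, I noticed that when I collapse the data I lose a number of variables that I want to keep. I am collapsing by the variable MRN (numeric). The other variables I want to keep are CITY and SITE (both character), and they are constant within each unique MRN, so collapsing them should be fine.
Here is the code I am using: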
proc sql;
    create table collapsed_data as
    select distinct mrn,
           sum(msk_tx_yes) as msk_tx_yes,
           sum(msk_cancel_tx_yes) as msk_cancel_tx_yes,
           sum(msk_ca_yes) as msk_ca_yes,
           sum(msk_cancel_ca_yes) as msk_cancel_ca_yes,
           sum(msk_dc_yes) as msk_dc_yes,
           sum(conc_psych_tx_yes) as conc_psych_tx_yes,
           sum(conc_psych_ca_yes) as conc_psych_ca_yes,
           sum(conc_psych_dc_yes) as conc_psych_dc_yes,
           sum(conc_yes) as conc_yes,
           sum(psych_yes) as psych_yes,
           sum(foot_prog) as foot_prog,
           sum(hand_prog) as hand_prog,
           sum(surg_prog) as surg_prog,
           sum(sx_yes) as sx_yes
    from temp_collapsed_data
    group by mrn;
quit;
I am not sure how to use SELECT and DISTINCT together.
I thought I might be able to add the variables CITY and STATE after SELECT while keeping DISTINCT, but I am not sure that this works.
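I would like to keep CITY and STATE in the new table alongside the new summed variables. How can I do this without converting CITY and STATE into dummy-coded variables? If possible, I would like to keep them as character values.
Does anyone know how to achieve this?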
Answer 0 (score: 0)
Your code is already correct. Just add the variables to the select statement:
proc sql;
    create table collapsed_data as
    select distinct mrn, city, site,
           sum(msk_tx_yes) as msk_tx_yes,
           sum(msk_cancel_tx_yes) as msk_cancel_tx_yes,
           sum(msk_ca_yes) as msk_ca_yes,
           sum(msk_cancel_ca_yes) as msk_cancel_ca_yes,
           sum(msk_dc_yes) as msk_dc_yes,
           sum(conc_psych_tx_yes) as conc_psych_tx_yes,
           sum(conc_psych_ca_yes) as conc_psych_ca_yes,
           sum(conc_psych_dc_yes) as conc_psych_dc_yes,
           sum(conc_yes) as conc_yes,
           sum(psych_yes) as psych_yes,
           sum(foot_prog) as foot_prog,
           sum(hand_prog) as hand_prog,
           sum(surg_prog) as surg_prog,
           sum(sx_yes) as sx_yes
    from temp_collapsed_data
    group by mrn;
quit;
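The distinct keyword ensures that no two rows in the result are identical. Because CITY and SITE are constant within each MRN, that leaves one row per MRN.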
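One thing worth knowing about the query above: because city and site appear in the select list but not in the group by clause, PROC SQL remerges the summary statistics back onto the original rows (the log shows a note about remerging), and distinct then removes the duplicates. A possible alternative, sketched here on the assumption that city and site really are constant within each mrn, is to list them in the group by clause so that no remerge or distinct is needed. Only two of the sums are shown; the rest would follow the same pattern. The first query is an optional sanity check and is not part of the original answer.

```sas
proc sql;
    /* Optional check: list any MRN with more than one CITY or SITE value. */
    /* If this returns no rows, the assumption of constancy holds.         */
    select mrn,
           count(distinct city) as n_city,
           count(distinct site) as n_site
    from temp_collapsed_data
    group by mrn
    having count(distinct city) > 1 or count(distinct site) > 1;

    /* Collapse to one row per MRN, keeping CITY and SITE as character     */
    /* columns by grouping on them as well.                                */
    create table collapsed_data as
    select mrn, city, site,
           sum(msk_tx_yes) as msk_tx_yes,
           sum(sx_yes) as sx_yes   /* remaining sums omitted for brevity */
    from temp_collapsed_data
    group by mrn, city, site;
quit;
```

If the check does return rows, grouping on all three columns would produce more than one row for those MRNs, which makes the inconsistency visible rather than hiding it.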