I need help working out the following situation.
Table:
pr_id lob_id prec1 prec2 prec3
112 1a 3478 56 77
112 1b 3466 65 43
112 1c 5677 57 68
112 1d 5634 49 52
215 2a 1234 43 45
215 2b 9787 32 43
215 2c 4566 39 90
388 3a 8797 88 99
388 3b 6579 58 72
388 3c 9087 76 67
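Required output: the separate observation rows for each pr_id need to be transposed into a single row per pr_id, with each lob_id and its prec1-prec3 values placed side by side, as shown below.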
pr_id lob_id prec1 prec2 prec3 lob_id prec1 prec2 prec3 lob_id prec1 prec2 prec3 lob_id prec1 prec2 prec3
112 1a 3478 56 77 1b 3466 65 43 1c 5677 57 68 1d 5634 49 52
215 2a 1234 43 45 2b 9787 32 43 2c 4566 39 90 . . . .
388 3a 8797 88 99 3b 6579 58 72 3c 9087 76 67 . . . .
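I tried using PROC TRANSPOSE, but the variable names come out different from the required output. Could you please help me solve this?
Thanks.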
Answer 0: (score: 1)
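This gets as close as you can to what you are asking for. It is a bit more complicated than probably needed, but it does ensure that each lob_id stays together with its prec1-3. You cannot use the same variable name for more than one variable, but you can use the same label, so I keep the label the same while adding _1, _2, _3 and so on to the names.
You can then PROC PRINT the dataset if you want to see this in the output window (that should display the labels, which gives you the repeated column names you want in the output).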
data have;
input pr_id lob_id $ prec1 prec2 prec3;
datalines;
112 1a 3478 56 77
112 1b 3466 65 43
112 1c 5677 57 68
112 1d 5634 49 52
215 2a 1234 43 45
215 2b 9787 32 43
215 2c 4566 39 90
388 3a 8797 88 99
388 3b 6579 58 72
388 3c 9087 76 67
;;;;
run;
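data have_pret;
  set have;
  by pr_id;
  array precs prec:;
  /* counter numbers the lob_id rows within each pr_id */
  if first.pr_id then counter=0;
  counter+1;
  /* varnamecounter records the order in which the new variable names appear */
  varnamecounter+1;
  /* one output row for the character value (lob_id) */
  valuet=lob_id;
  idname=cats("lob_id",'_',counter);
  idlabel="lob_id";
  output;
  call missing(valuet);
  /* one output row for each numeric value (prec1-prec3) */
  do __t = 1 to dim(precs);
    varnamecounter+1;
    valuen=precs[__t];
    idname=cats('prec',__t,'_',counter);
    idlabel=vlabel(precs[__t]);
    output;
  end;
  call missing(valuen);
  keep pr_id valuet valuen idname idlabel varnamecounter;
run;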
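/* Find the first position at which each new variable name appears */
proc sort data=have_pret out=varcounter(keep=idname varnamecounter);
  by idname varnamecounter;
run;
data varcounter_fin;
  set varcounter;
  by idname varnamecounter;
  if first.idname;
run;
/* Store the names, in that order, in the macro variable varlist */
proc sql noprint;
  select idname into :varlist separated by ' '
    from varcounter_fin order by varnamecounter;
quit;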
/* Transpose the numeric values */
proc transpose data=have_pret(where=(not missing(valuen))) out=want_n;
  by pr_id;
  var valuen;
  id idname;
  idlabel idlabel;
run;
/* Transpose the character values */
proc transpose data=have_pret(where=(missing(valuen))) out=want_t;
  by pr_id;
  var valuet;
  id idname;
  idlabel idlabel;
run;
/* RETAIN before MERGE fixes the column order */
data want;
  retain pr_id &varlist.;
  merge want_n want_t;
  by pr_id;
  drop _name_;
run;
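As noted above, printing the result with labels is what gives the repeated column headers. A minimal sketch, assuming the want dataset was created by the steps above; the LABEL option tells PROC PRINT to show variable labels instead of names, and NOOBS (optional) drops the observation number column:

```sas
/* Print WANT using labels as column headers, so lob_id_1, lob_id_2, ... */
/* all appear under the heading "lob_id" (likewise for prec1-prec3).    */
proc print data=want label noobs;
run;
```

pr_id has no label, so it is printed under its own name.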
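Doing this in SQL is annoying; SAS does not support the more advanced SQL table functions that would let you transpose this neatly without hard-coding everything. It would be something like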
proc sql;
  select pr_id,
    max(lob_id1) as lob_id1, max(prec1_1) as prec1_1, max(prec2_1) as prec2_1, max(prec3_1) as prec3_1,
    max(lob_id2) as lob_id2, max(prec1_2) as prec1_2, max(prec2_2) as prec2_2, max(prec3_2) as prec3_2 from (
      select pr_id,
        case when substr(lob_id,2,1)='a' then lob_id else ' ' end as lob_id1,
        case when substr(lob_id,2,1)='a' then prec1 else . end as prec1_1,
        case when substr(lob_id,2,1)='a' then prec2 else . end as prec2_1,
        case when substr(lob_id,2,1)='a' then prec3 else . end as prec3_1,
        case when substr(lob_id,2,1)='b' then lob_id else ' ' end as lob_id2,
        case when substr(lob_id,2,1)='b' then prec1 else . end as prec1_2,
        case when substr(lob_id,2,1)='b' then prec2 else . end as prec2_2,
        case when substr(lob_id,2,1)='b' then prec3 else . end as prec3_2
      from have )
  group by pr_id;
quit;
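but extended to include 3 and 4. I hope you can see why it is silly to do this in SQL :) The SAS code is probably actually shorter, and it is doing a lot more work to make this easily extensible. You could skip half of it if you just hard-coded the retain statement, for example.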
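For illustration, here is a sketch of that hard-coded variant. It assumes at most four lob_id rows per pr_id (as in the sample data) and reuses the want_n and want_t datasets from the two PROC TRANSPOSE steps above. With the names typed in by hand, the PROC SORT, the varcounter_fin step and the PROC SQL step that builds &varlist. are no longer needed.

```sas
/* Hard-coded column order: assumes no pr_id has more than four lob_id rows. */
data want;
  retain pr_id
         lob_id_1 prec1_1 prec2_1 prec3_1
         lob_id_2 prec1_2 prec2_2 prec3_2
         lob_id_3 prec1_3 prec2_3 prec3_3
         lob_id_4 prec1_4 prec2_4 prec3_4;
  merge want_n want_t;
  by pr_id;
  drop _name_;
run;
```

The trade-off is that the list has to be edited by hand whenever the maximum number of lob_id rows per pr_id changes, which is what the macro-variable version avoids.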