I am using the following data step to concatenate multiple observations into a single variable:
data Data_PreFinal;
set work.reasons;
by Number;
length Changes $4000.;
retain Changes;
if first.Number then Changes = EndoReason;
else Changes = catx(', ', Changes, EndoReason);
if last.Number then output;
run;
For example, suppose the dataset reasons looks like this:
Number EndoReason
1 Bucket1
1 Bucket2
1 Bucket1
1 Bucket3
1 Bucket2
1 Bucket2
2 Bucket2
2 Bucket2
2 Bucket1
2 Bucket2
I would like the resulting dataset Data_PreFinal to look like this:
Number EndoReason
1 Bucket1, Bucket2, Bucket3
2 Bucket2, Bucket1
rather than listing every duplicate value of the EndoReason variable.
Any help would be greatly appreciated!
Thanks!
Answer 0 (score: 1)
Hi! It may help to remove the duplicate observations first. For example:
data reasons;
input Number EndoReason : $30.;
datalines;
1 Bucket1
1 Bucket2
1 Bucket1
1 Bucket3
1 Bucket2
1 Bucket2
2 Bucket2
2 Bucket2
2 Bucket1
2 Bucket2
;
* Remove duplicate Number/EndoReason rows (this also sorts EndoReason within each Number);
proc sort data=reasons out=reasons_nodup nodup;
by Number EndoReason;
run;
data Data_PreFinal;
set work.reasons_nodup;
by Number;
length Changes $4000.;
retain Changes;
if first.Number then Changes = EndoReason;
else Changes = catx(', ', Changes, EndoReason);
if last.Number then output;
drop EndoReason;
rename Changes = EndoReason;
run;
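One thing to be aware of: because the sort is by Number EndoReason, the values come out in alphabetical order, so Number 2 gives "Bucket1, Bucket2" rather than the "Bucket2, Bucket1" shown in the question. If first-appearance order matters, a possible alternative is to skip the sort and only append a value when it is not already in the list. This is just a sketch, assuming that work.reasons is already sorted by Number and that EndoReason values never contain commas or spaces (FINDW treats both as word delimiters here):

```sas
data Data_PreFinal;
  set work.reasons;              /* assumed to be sorted by Number already */
  by Number;
  length Changes $4000;
  retain Changes;
  if first.Number then Changes = EndoReason;
  /* Append only when the value is not already a whole word in Changes. */
  /* Assumes values contain no commas or spaces (the delimiters used).  */
  else if findw(Changes, strip(EndoReason), ', ') = 0 then
    Changes = catx(', ', Changes, EndoReason);
  if last.Number then output;
  drop EndoReason;
  rename Changes = EndoReason;
run;
```

With the sample data this should keep each value once, in the order it first appears within each Number.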
Good luck!