(SAS) Removing duplicates during concatenation

Time: 2018-09-13 15:15:17

Tags: sas

I am using the following data step to concatenate multiple observations into a single variable:

data Data_PreFinal;
set work.reasons;
by Number;
length Changes $4000.;
retain Changes;
if first.Number then Changes = EndoReason;
else Changes = catx(', ', Changes, EndoReason);
if last.Number then output;
run;

For example, I want to make sure that if the data set "reasons" looks like this:

Number    EndoReason
1         Bucket1
1         Bucket2
1         Bucket1
1         Bucket3
1         Bucket2
1         Bucket2
2         Bucket2
2         Bucket2
2         Bucket1
2         Bucket2

then the resulting data set Data_PreFinal looks like this:

Number    EndoReason
1         Bucket1, Bucket2, Bucket3
2         Bucket2, Bucket1

instead of listing every duplicate value of the EndoReason variable.

Any help would be greatly appreciated!

Thanks!

1 answer:

Answer 0: (score: 1)

Friend, it may help to remove the duplicate observations first. For example:

data reasons;
    input Number EndoReason : $30.;
    datalines;
1         Bucket1
1         Bucket2
1         Bucket1
1         Bucket3
1         Bucket2
1         Bucket2
2         Bucket2
2         Bucket2
2         Bucket1
2         Bucket2
;

* Keep one observation per Number/EndoReason pair (NODUPKEY compares only the BY variables);
proc sort data=reasons out=reasons_nodup nodupkey;
    by Number EndoReason;
run;

data Data_PreFinal;
    set work.reasons_nodup;
    by Number;
    length Changes $4000.;
    retain Changes;
    if first.Number then Changes = EndoReason;
    else Changes = catx(', ', Changes, EndoReason);
    if last.Number then output;

    drop EndoReason;
    rename Changes = EndoReason;
run;
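
Note that sorting by Number and EndoReason puts the reasons in alphabetical order, so Number 2 comes out as "Bucket1, Bucket2" rather than the "Bucket2, Bucket1" shown in the question. If the order of first appearance matters, one possible alternative is to skip the sort and check, before appending, whether the value is already in the list. The following is only a sketch under some assumptions: work.reasons is already ordered by Number (as the original BY statement requires), and the reason values contain no commas or spaces, because both are used as word delimiters in FINDW.

```sas
data Data_PreFinal;
    set work.reasons;
    by Number;
    length Changes $4000;
    retain Changes;

    /* Start a new list at the first observation of each Number */
    if first.Number then Changes = EndoReason;
    /* FINDW returns 0 when EndoReason is not yet a whole word in Changes.
       Assumption: values contain no commas or spaces (the delimiters). */
    else if findw(Changes, strip(EndoReason), ', ') = 0 then
        Changes = catx(', ', Changes, EndoReason);

    /* Write one row per Number, holding the de-duplicated list */
    if last.Number then output;

    drop EndoReason;
    rename Changes = EndoReason;
run;
```

This variant reads the data in a single pass and needs no intermediate data set, at the cost of scanning the growing Changes string for every observation.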

Good luck!