Question

I have a dataset similar to the one below.

ID  A   B  C  D  E
1   1
1          1
1       1
2       1
2             1
3                1
3   1
4          1
5       1

I want to condense the data for each ID into a single row, so that the dataset looks like this:

ID  A   B  C  D  E
 1  1   1  1 
 2      1     1
 3  1            1
 4         1
 5      1

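I created a second table with the duplicate IDs removed, so I now have two tables, A and B. I then tried to merge the two datasets. I have been playing with the following SAS code.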

data C; 
     merge A B; 
     by ID;
run;

Answer 1

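Here is a neat trick I picked up from another forum. There is no need to split the original dataset: in the UPDATE statement, the first dataset reference (with obs=0) creates the structure, and the second supplies the values. The BY statement ensures that each ID ends up with only one record.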

data have;
infile datalines dsd;
input ID  A   B  C  D  E;
datalines;
1,1,,,,,
1,,,1,,,
1,,1,,,,
2,,1,,,,
2,,,,1,,
3,,,,,1,
3,1,,,,,
4,,,1,,,
5,,1,,,
;
run;

data want;
update have (obs=0) have;
by id;
run;
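
For what it's worth, this works because UPDATE, by default, does not overwrite a value in the master record with a missing value from the transaction dataset. Below is a minimal sketch that spells out that default with the UPDATEMODE= option and prints the result. It assumes `have` is already sorted by ID, as the sample data above is.

```sas
data want;
  /* master = empty copy of have (structure only); transaction = all rows of have */
  update have(obs=0) have updatemode=missingcheck;  /* MISSINGCHECK is the default */
  by id;
run;

/* quick check of the collapsed result */
proc print data=want noobs;
run;
```

With UPDATEMODE=NOMISSINGCHECK, missing values in later rows would overwrite earlier non-missing ones, which is not what you want here.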

Answer 2

This can be solved with a RETAIN statement.

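data B(rename=(A2=A B2=B C2=C D2=D E2=E));
  set A;  /* A must be sorted by id */
  by id;
  retain A2 B2 C2 D2 E2;
  /* reset the retained values at the start of each id */
  if first.id then do;
    A2 = .;
    B2 = .;
    C2 = .;
    D2 = .;
    E2 = .;
  end;
  /* keep any non-missing value seen within the id */
  if A ne . then A2=A;
  if B ne . then B2=B;
  if C ne . then C2=C;
  if D ne . then D2=D;
  if E ne . then E2=E;
  /* write one row per id */
  if last.id then output;
  drop A B C D E;
run;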

There are other ways to solve this, but hopefully this helps.
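
One such variation, as a rough sketch: the same retain idea can be written with arrays so that each column does not need its own statement. The array names `src` and `held` and the loop index `i` are made up for illustration. The code assumes A is sorted by id and that columns A through E are numeric.

```sas
data B;
  set A;
  by id;
  array src{5} A B C D E;      /* columns read from the input row */
  array held{5} _temporary_;   /* temporary arrays keep their values across rows */

  /* reset the held values at the start of each id */
  if first.id then do i = 1 to dim(held);
    held{i} = .;
  end;

  /* remember any non-missing value seen within the id */
  do i = 1 to dim(src);
    if not missing(src{i}) then held{i} = src{i};
  end;

  /* on the last row of the id, copy the held values back and write one row */
  if last.id then do;
    do i = 1 to dim(src);
      src{i} = held{i};
    end;
    output;
  end;
  drop i;
run;
```

This avoids the rename/drop step, since the held values are written back into the original columns before the output.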

Answer 3

PROC MEANS is a great tool for this. PROC SQL would also give you a reasonable solution, but MEANS is faster.

proc means data=yourdata noprint;
var a b c d e;
class id;
types id; *to avoid the 'overall' row;
output out=want(drop=_type_ _freq_) max=; *output the maximum of each var for each ID - use SUM instead if you want more than 1. Written to a new dataset so the input is not overwritten;
run;
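
For comparison, the PROC SQL route mentioned above might look roughly like this. It is only a sketch: the input name `yourdata` follows the PROC MEANS example, and the output name `want` is an assumption.

```sas
proc sql;
  create table want as
  select id,
         max(a) as a,   /* max() ignores missing values within each id */
         max(b) as b,
         max(c) as c,
         max(d) as d,
         max(e) as e
  from yourdata
  group by id;
quit;
```

Unlike the DATA step approaches, this does not need the input to be sorted by id first.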

SAS: merging/condensing data
