Question

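I have a dataset like the one below. Each row is a separate observation with anywhere from 1 to x values (here x = 3). I want to create a dataset that keeps the original information but adds four extra columns, one for each of the four Bin values present in the dataset. If any 1 appears in a row, the column freq_Bin_1 should be 1 for that row, otherwise it should be missing. If any 2 appears, the column freq_Bin_2 should be 1, and so on.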

The number of bins and the number of columns in the original dataset may vary.

data have;
    input Bin_1 Bin_2 Bin_3;
cards;
1 . .
3 . .
1 1 .
3 2 1
3 4 .
;
run;

This is the output I want:

data want_this;
    input Bin_1 Bin_2 Bin_3 freq_Bin_1 freq_Bin_2 freq_Bin_3 freq_Bin_4;
cards;
1 . . 1 . . .
3 . . . . 1 .
1 1 . 1 . . .
3 2 1 1 1 1 .
3 4 . . . 1 1
;
run;

I have an array-based solution that I think is very close, but I can't get it to work. I'm also open to other approaches.

data want;
    set have;
    array Bins {&max_freq.} Bin:; 
    array freq_Bin {&num_bin.} freq_Bin_1-freq_Bin_&num_bin.;
    do j=1 to dim(Bins);
        freq_Bin(j)=.;
    end;
    do k=1 to dim(freq_Bin);
        if Bins(k)=1 then freq_Bin(1)=1;
        else if Bins(k)=2 then freq_Bin(2)=1;
        else if Bins(k)=3 then freq_Bin(3)=1;
        else if Bins(k)=4 then freq_Bin(4)=1;
    end;
    drop j k;
run;

Answer 1

This should work:

data want;
    set have;
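    /* All variables whose names start with "Bin" (Bin_1, Bin_2, ...) */
    array Bins{*} Bin:;
    /* Name the flags explicitly; a bare freq_Bin{4} would create freq_Bin1-freq_Bin4 */
    array freq_Bin{4} freq_Bin_1-freq_Bin_4;
    do k=1 to dim(Bins);
        /* Use the bin value itself as the index, skipping missing values */
        if Bins(k) ne . then freq_Bin(Bins(k))=1;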
    end;
    drop k;
run;

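I tweaked your code slightly, but really the only issue is that you need to check that Bins(k) is not missing before using it to index the array. Also, there is no need to initialize the values to missing, because that is the default.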
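
Since the question mentions that the number of bins may vary, here is one possible way to avoid hard-coding the 4. This is only a sketch: it assumes the bin values are positive integers, and it reuses the macro variable name num_bin from the question (max_bin is a hypothetical helper variable). A first pass finds the largest bin value and stores it in the macro variable, and a second pass sizes the flag array from it.

```sas
/* Pass 1: find the largest bin value across all Bin: columns */
data _null_;
    set have end=last;
    array Bins{*} Bin:;
    retain max_bin;                      /* hypothetical helper variable */
    max_bin = max(max_bin, max(of Bins{*}));
    if last then call symputx('num_bin', max_bin);
run;

/* Pass 2: build the flags, sizing the array from &num_bin. */
data want;
    set have;
    array Bins{*} Bin:;
    array freq_Bin{&num_bin.} freq_Bin_1-freq_Bin_&num_bin.;
    do k = 1 to dim(Bins);
        /* Skip missing values before using the bin value as an index */
        if not missing(Bins{k}) then freq_Bin{Bins{k}} = 1;
    end;
    drop k;
run;
```

For the sample data the largest bin value is 4, so this creates freq_Bin_1 through freq_Bin_4. If the bin values could be non-integer, zero, or negative, the index would fall outside the array bounds and this sketch would need an extra check.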
