Question

I have a dataset ADDRESS, shown below:

data address;
    input fulladdress $char80.;
    datalines;
RM 101A TOWER 607,PALZZ     ,28 ABC ST     ,DISTRICT 14
FLAT 2426 24/F   ,KKL HSE   ,HAPPY ESTATE  ,DISTRICT 10
FLAT 08 18/F     ,BBC HOUSE ,CDEFG COURT   ,DISTRICT 9  , testingAdd5
;
run;

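As you can see, in each observation the address components are separated by the delimiter ",". The array dimension is therefore dynamic (4 for the first two observations and 5 for the last one).

This is what I tried: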

data addressnew;
    set address;
    count = count(fulladdress,",") + 1;
    array add[5] $30.;
    do i = 1 to dim(add);
        add[i] = scan(fulladdress,i,",");
    end;
run;

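I used 5 as the dimension of array add, and I use count() to find how many address components each row has. How can I use that value to set the dimension of the array? Something like array[&count]?

According to @NEOman's answer, I can use add[*] if I don't know the dimension of the array. I get the following errors: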

2252      array add[*] $30. ;
ERROR: The array add has been defined with zero elements.
2253      do i = 1 to count;
2254          add[i] = scan(fulladdress,i,",");
ERROR: Too many array subscripts specified for array add.

The output I want is: (image not included)

Answer 1

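If you are not sure of the number of elements, use array add[*]!

Or you can define a _TEMPORARY_ array as below, with a dimension larger than the number of elements. To be safe, I chose 100.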

http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a000201956.htm

data _null_;
set address;
count = count(fulladdress,",") + 1;
put count=;
/* temporary array with a generous size of 100; $30 avoids the default 8-byte length */
array addn{100} $30 _temporary_;
do i = 1 to count;
    addn[i] = scan(fulladdress,i,",");
    put addn[i]=;
end;
run;

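EDIT1: As I understand your question, if an address has six segments, you want to create the variables add1-add6 and store the segments in them.

I tried to do this with a dynamic array, but for some reason I ran into strange errors.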

data addressnew;
set address;
count = count(fulladdress,",") + 1;
put count=;

array addn[*] addn: ;
do i = 1 to count;
    addn[i] = scan(fulladdress,i,",");
    put addn[i]=;
end;

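Below is TESTED code. It may not be the most sophisticated approach programming-wise (though I don't think it has any negative impact on execution time or space), but it works. Hopefully someone will come up with a simpler solution.

First, find the maximum number of segments by scanning all records in the dataset.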

data temp;
set address(keep=fulladdress);
countnew = count(fulladdress,",") + 1;
run;

proc sql noprint;
select max(countnew) into: count_seg from temp;
quit;

%put &count_seg.;

/* Using ARRAY */

data _null_;
set address;
count = count(fulladdress,",") + 1;
put count=;

array add{%sysfunc(compress(&count_seg.))} $30.;
do i = 1 to count;
    add[i] = scan(fulladdress,i,",");
    put add[i]=;
end;
run;

/* Using MACRO */

%macro test();
data _null_;
    set address;
    countnew = count(fulladdress,",") + 1;
       %do i = 1 %to &count_seg.;
        add&i. = scan(fulladdress,&i.,",");
        put add&i.=;
    %end;
run;

%mend;
%test;

Answer 2

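Arrays in SAS reference other variables, and their size is not dynamic. The array needs to be at least as large as the list of elements. Every row will have the same number of variables, and the trailing ones will be blank where necessary. You can make your code work by looping up to the count variable instead of the dim of the array.

If you don't know the size of the list/array up front, you have to find it first.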
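As a minimal sketch of looping to the count rather than to dim(), assuming you already know that no address has more than 5 components (the bound of 5, the strip() call, and the drop statement are illustrative additions, not part of the original code):

```sas
data addressnew;
    set address;
    /* Assumption: 5 is a safe upper bound on the number of components */
    array add[5] $30;
    count = count(fulladdress, ",") + 1;
    /* Loop only over the components present in this row, capped at the array size */
    do i = 1 to min(count, dim(add));
        add[i] = strip(scan(fulladdress, i, ","));
    end;
    drop i;
run;
```

Rows with fewer components simply leave the remaining add variables blank.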

  *EDIT: Here's a way to find the max size of the array first;

  data _null_;
    set address end=eof;
    retain max_count 0;
    count = count(fulladdress,",") + 1;
    if count>max_count then max_count=count;
    if eof then call symputx('array_size', max_count);
  run;

 data addressnew;
  set address;
  array add[&array_size.] $30.;
  count = count(fulladdress,",") + 1;

  do i = 1 to count;
    add[i] = scan(fulladdress,i,",");
  end;
 run;

Dynamic dimension of an array in SAS
