I have a dataset ADDRESS, shown below:
data address;
input fulladdress $char80.;
datalines;
RM 101A TOWER 607,PALZZ ,28 ABC ST ,DISTRICT 14
FLAT 2426 24/F ,KKL HSE ,HAPPY ESTATE ,DISTRICT 10
FLAT 08 18/F ,BBC HOUSE ,CDEFG COURT ,DISTRICT 9 , testingAdd5
;
run;
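As you can see, the address components in each observation are separated by the delimiter ",". The number of components therefore varies from row to row (4 for the first two observations, 5 for the last one).
When I try the following: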
data addressnew;
set address;
count = count(fulladdress,",") + 1;
array add[5] $30.;
do i = 1 to dim(add);
add[i] = scan(fulladdress,i,",");
end;
run;
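I use 5 as the dimension of array add, and I use count() to find how many address components each row has. How can I use that value to set the dimension of the array? Something like array[&count]?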
Following @NEOman's answer, I can use add[*] if I don't know the dimension of the array. Doing so gives me the following errors:
2252 array add[*] $30. ;
ERROR: The array add has been defined with zero elements.
2253 do i = 1 to count;
2254 add[i] = scan(fulladdress,i,",");
ERROR: Too many array subscripts specified for array add.
The output I want is
Answer 0 (score: 2)
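If you are not sure about the number of elements, use array add[*]. Note that [*] only tells SAS to count an element list that you supply explicitly; it cannot size an array whose elements are not listed.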
Alternatively, you can define a _TEMPORARY_ array, as shown below, with a dimension larger than the number of elements. To be safe, I chose 100.
http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a000201956.htm
data _null_;
set address;
count = count(fulladdress,",") + 1;
put count=;
array addn{100} $30 _temporary_; /* 100 slots as a safe upper bound; $30 avoids the default character length of 8 */
do i = 1 to count;
addn[i] = scan(fulladdress,i,",");
put addn[i]=;
end;
run;
EDIT1:
As I understand your question, if an address has six segments, you want to create the variables add1-add6 and store the segments in them.
I tried to do this with a dynamic array, but for some reason I got strange errors.
data addressnew;
set address;
count = count(fulladdress,",") + 1;
put count=;
array addn[*] addn: ;
do i = 1 to count;
addn[i] = scan(fulladdress,i,",");
put addn[i]=;
end;
run;
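Below is TESTED code. It may not be the most sophisticated approach from a programming point of view (although I don't think it has any negative effect on execution time or space), but it works. Hopefully someone will suggest a simpler solution.
First, find the maximum number of segments by scanning every record in the dataset.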
data temp;
set address(keep=fulladdress);
countnew = count(fulladdress,",") + 1;
run;
proc sql noprint;
select max(countnew) into: count_seg from temp;
quit;
%put &count_seg.;
/* Using ARRAY */
data _null_;
set address;
count = count(fulladdress,",") + 1;
put count=;
array add{%sysfunc(compress(&count_seg.))} $30.;
do i = 1 to count;
add[i] = scan(fulladdress,i,",");
put add[i]=;
end;
run;
/* Using MACRO */
%macro test();
data _null_;
set address;
countnew = count(fulladdress,",") + 1;
%do i = 1 %to &count_seg.;
add&i. = scan(fulladdress,&i.,",");
put add&i.=;
%end;
run;
%mend;
%test;
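The two steps above (find the maximum, then size the array from it) can be sketched more compactly. This is only a sketch, not the answerer's tested code: it assumes SAS 9.3 or later for the TRIMMED option of INTO, which removes the padding that the %sysfunc(compress()) call works around, and it uses COUNTC rather than COUNT so that PROC SQL cannot mistake the call for its own COUNT aggregate. The output dataset name addressnew is borrowed from the question.

```sas
/* Sketch: compute the maximum segment count without the intermediate TEMP dataset */
proc sql noprint;
    select max(countc(fulladdress, ",") + 1)
        into :count_seg trimmed      /* TRIMMED strips leading/trailing blanks (assumes SAS 9.3+) */
        from address;
quit;

%put NOTE: count_seg=&count_seg.;

data addressnew;
    set address;
    /* the array is sized once, at compile time, from the macro variable */
    array add[&count_seg.] $30;
    do i = 1 to dim(add);
        /* SCAN returns a blank when row has fewer than i segments */
        add[i] = strip(scan(fulladdress, i, ","));
    end;
    drop i;
run;
```

Because SCAN returns a blank for a missing segment, rows with fewer components simply leave the trailing add variables empty.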
Answer 1 (score: 2)
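Arrays in SAS are references to other variables, and their size is not dynamic. The array needs to be at least as large as your list of elements. Every row will have the same number of variables, and the trailing variables will be blank where a row has fewer components. You can get your code to work by looping up to the count variable instead of the dim of the array.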
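A minimal sketch of that idea, assuming a fixed upper bound of 5 taken from the sample data in the question (the bound is an assumption, not something SAS derives for you):

```sas
/* Sketch: fixed-size array, loop only over the components that are present */
data addressnew;
    set address;
    array add[5] $30;                       /* 5 is an assumed upper bound */
    count = count(fulladdress, ",") + 1;    /* components in this row */
    /* MIN guards against a row with more components than the array can hold */
    do i = 1 to min(count, dim(add));
        add[i] = strip(scan(fulladdress, i, ","));
    end;
    drop i;
run;
```

Rows with four components leave add5 blank, which matches the "same number of variables on every row" behaviour described above.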
If you don't know the size of the list/array at the start, you have to find it first.
*EDIT: Here's a way to find the max size of the array first;
data _null_;
set address end=eof;
retain max_count 0;
count = count(fulladdress,",") + 1;
if count>max_count then max_count=count;
if eof then call symputx('array_size', max_count);
run;
data addressnew;
set address;
array add[&array_size.] $30.; /* array_size is the macro variable set by CALL SYMPUTX above */
count = count(fulladdress,",") + 1;
do i = 1 to count;
add[i] = scan(fulladdress,i,",");
end;
run;