Question

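I want to extract the nth row of a dataset, then every 33rd row after it, and store them in a new dataset.

I have a dataset containing n sets of 33 estimates. I want to extract all the a0 estimates into a dataset named A0, then all the a1 estimates into a dataset named A1, and so on, until I have 33 datasets.

I can do this for each element individually, but that takes a lot of code and I would like to simplify it. Here is the code that names one dataset and extracts the matching elements into it.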

data a0;
 set _parest1;
 if mod(_n_,33) = 1;
run;

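This is a specific question that is part of a larger problem. I have a number of datasets containing 34 estimated parameters (a0, a1 ... a33), and I want to take the mean of each estimate.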

Answer 1

Here is another approach: the hash-of-hashes method. It is largely taken from Paul Dorfman's paper on the subject, Data Step Hash Objects as Programming Tools.

data estimates;
  do id = 1 to 50;
    output;
  end;
run;

data _null_;
  if 0 then set estimates;
  length estimate 8;
  if _n_=1 then do;
    declare hash e ();
    declare hash h(ordered:'a');
    h.defineKey('estimate');
    h.defineData('estimate','e');  *this 'e' is the name of the other hash!;
    h.defineDone();
    declare hiter hi('h');
  end;
  do _n_ = 1 by 1 until (eof);
    set estimates end=eof;
    estimate = mod(_n_-1,5);
    rc_h = h.find();
    if rc_h ne 0 then do;
      e = _new_ hash(ordered:'a');
      e.defineKey('estimate','id');
      e.defineData('estimate','id');
      e.defineDone();
      h.replace(); 
    end;
    e.replace();
  end;

  do rc = hi.next() by 0 while (rc = 0);
    e.output(dataset: cats('out',estimate));
    rc = hi.next();
  end;
run;

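This lets you output any arbitrary number of datasets, which is nice. Here you would replace 5 with 33 and adjust the variable names ('estimate' is the estimate number, which I compute with MOD, though you may already have it in your dataset; 'id' is whatever identifies that row, and a row number is fine, even _N_). If you have other data variables, which you probably do, add them to e's defineData.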
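As a rough sketch of that last adjustment, the step below carries one extra data variable through the inner hash. The dataset name parest_demo and the variable name value are made up for illustration, and the toy data uses groups of 5 estimates rather than 33.

```sas
/* Hypothetical input: 3 groups of 5 estimates; 'value' is an assumed name */
data parest_demo;
  do id = 1 to 15;
    value = id * 0.1;
    output;
  end;
run;

data _null_;
  if 0 then set parest_demo;         /* brings id and value into the PDV */
  length estimate 8;
  declare hash e ();                 /* placeholder for the inner hashes */
  declare hash h (ordered:'a');      /* outer hash: one entry per estimate number */
  h.defineKey('estimate');
  h.defineData('estimate', 'e');
  h.defineDone();
  declare hiter hi ('h');

  do _n_ = 1 by 1 until (eof);
    set parest_demo end=eof;
    estimate = mod(_n_ - 1, 5);      /* 5 estimates per group in this toy data */
    if h.find() ne 0 then do;
      e = _new_ hash(ordered:'a');
      e.defineKey('estimate', 'id');
      e.defineData('estimate', 'id', 'value');  /* extra variable added here */
      e.defineDone();
      h.replace();
    end;
    e.replace();
  end;

  /* write one dataset per estimate number: out0 through out4 */
  do rc = hi.next() by 0 while (rc = 0);
    e.output(dataset: cats('out', estimate));
    rc = hi.next();
  end;
  stop;
run;
```

Each output dataset should then hold the estimate, id, and value columns for its estimate number.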

Answer 2

Use the firstobs= dataset option to start at the nth record:

data want;
  set have(firstobs=10);
  if _n_ = 1 then output;
  else if mod(_n_,33) = 1 then output;
run;
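The reason this works is that _n_ counts DATA step iterations rather than observation numbers, so it is 1 for the first record read after firstobs= is applied. A small sketch to check this, using a made-up 100-row dataset called demo_rows:

```sas
/* Hypothetical test data: 100 rows numbered 1 to 100 */
data demo_rows;
  do row = 1 to 100;
    output;
  end;
run;

data demo_want;
  set demo_rows(firstobs=10);
  /* _n_ is 1 for observation 10, 34 for observation 43, 67 for observation 76 */
  if mod(_n_, 33) = 1;
run;

/* should list row = 10, 43 and 76 */
proc print data=demo_want;
run;
```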

So, to loop over this, use a macro. For example:

data test;
  do i=1 to 100;
    output;
  end;
run;

%macro loop_split(n, mod, ds, outPre);
  %local i j;
  %do i=1 %to &n;
    %let j=%eval(&i-1);
    /* one output dataset per starting row, numbered from 0 */
    data &outPre&j;
      set &ds(firstobs=&i);
      if _n_ = 1 then output;
      else if mod(_n_,&mod) = 1 then output;
    run;
  %end;
%mend;

%loop_split(33,33,test,want);

I kept the n and mod values separate because they do not have to be the same, although in your case they are.
