SAS: Get a list of numbers based on decreasing months

Time: 2018-07-20 03:44:26

Tags: sas enterprise-guide

I have this data:

    data have;
    input cust_id $ pmt months;
    datalines;
    AA 100 0
    AA 50 1
    AA 200 2
    AA 350 3
    AA 150 4
    AA 700 5
    BB 500 0
    BB 300 1
    BB 1000 2
    BB 800 3
    run;

I want to generate output that looks like this:

    data want;
    input cust_id $ pmt months i;
    datalines;
    AA 100 0 0
    AA 50 0 1
    AA 200 0 2
    AA 350 0 3
    AA 150 0 4
    AA 700 0 5
    AA 50 1 0
    AA 200 1 1
    AA 350 1 2
    AA 150 1 3
    AA 700 1 4
    AA 200 2 0
    AA 350 2 1
    AA 150 2 2
    AA 700 2 3
    AA 350 3 0
    AA 150 3 1
    AA 700 3 2
    AA 150 4 0
    AA 700 4 1
    AA 700 5 0
    BB 500 0 0
    BB 300 0 1
    BB 1000 0 2
    BB 800 0 3
    BB 300 1 0
    BB 1000 1 1
    BB 800 1 2
    BB 1000 2 0
    BB 800 2 1
    BB 800 3 0
    run;

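There are a few thousand rows, with different cust_id values and a different number of months for each. I tried joining the tables, but I can't get the sequence 100 50 200 350 150 700 (for cust_id AA). I can only replicate 100 when months is 0, 50 when months is 1, and so on. I created maxval, which is the maximum months value. My code looks like this: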

    data temp1;
    set have;
    do i = 0 to maxval;
    if (months <=maxval) then output;
    end;

I thought of creating a unique key to join my have data to the temp1 data, but that only gives me:

    AA 100 0 0
    AA 50 0 1
    AA 200 0 2
    AA 350 0 3
    AA 150 0 4
    AA 700 0 5
    AA 100 1 0
    AA 50 1 1
    AA 200 1 2
    AA 350 1 3
    AA 150 1 4
    AA 100 2 0
    AA 50 2 1
    AA 200 2 2
    AA 350 2 3
    AA 100 3 0
    AA 50 3 1
    AA 200 3 2
    AA 100 4 0
    AA 50 4 1
    AA 100 5 0

Any ideas, or other approaches, for generating my want table? Thanks!

2 answers:

Answer 0 (score: 0)

Try the following:

    data want(drop = start_obs limit j);
       retain start_obs 1;

       /* read by cust_id group */
       do until(last.cust_id);
          set have end = last_obs;
          by cust_id;
       end;

       /* months on the last row of the group is the largest offset */
       limit = months;

       do j = 0 to limit;
          i = 0;

          do obs_num = start_obs + j to start_obs + limit;
             /* read specific observations using direct access */
             set have point = obs_num;

             months = j;
             output;
             i = i + 1;
          end;
       end;

       /* prepare for next direct access read: skip the limit + 1 rows just processed */
       start_obs = start_obs + limit + 1;

       if last_obs then
          stop;

    run;
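
One way to sanity-check the result, sketched under the assumption that months runs 0, 1, 2, ... within each cust_id: a group with n input rows should produce n*(n+1)/2 output rows (21 for AA and 10 for BB in the sample data). The aliases n_in, n_out, and n_expected are illustrative names, not part of the original answer.

```sas
proc sql;
  /* compare actual and expected row counts per customer */
  select w.cust_id,
         h.n_in,
         w.n_out,
         h.n_in * (h.n_in + 1) / 2 as n_expected
  from (select cust_id, count(*) as n_out from want group by cust_id) as w
  inner join
       (select cust_id, count(*) as n_in from have group by cust_id) as h
    on w.cust_id = h.cust_id;
quit;
```

Any customer where n_out differs from n_expected points to a gap in months or to a problem in the step above.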

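Answer 1 (score: 0)

This problem is a little tricky because things move in three directions:

  • The number of repeats of a group counts down from the group size. Within each repeat:
    • the starting index of the pmt items goes up, and the run always ends at the group size
    • the months values, output as i, always start at index 1, and their ending index goes down from the group size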
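
The three index ranges can be made concrete with a small sketch. It assumes a single group of 6 rows (like cust_id AA); the names index_sketch, n, and pmt_row are hypothetical and only serve the illustration. The nested loops enumerate which row's pmt is paired with each (months, i) combination.

```sas
data index_sketch;
  n = 6;                          /* assumed group size (cust_id AA) */
  do months = 0 to n - 1;         /* one repeat per starting month */
    do i = 0 to n - 1 - months;   /* each repeat is one row shorter than the last */
      pmt_row = months + i + 1;   /* 1-based row within the group that supplies pmt */
      output;
    end;
  end;
run;
```

For n = 6 this writes 6 + 5 + 4 + 3 + 2 + 1 = 21 rows, the same number of AA rows as in the desired output.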

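SQL

One SQL approach is a three-way reflexive join within each group. The months values serve as the within-group index, so they must start at 0 and increase monotonically by 1.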

    proc sql;
      create table want as
      select X.cust_id, Z.pmt, X.months, Y.months as i
      from have as X                             /* X: starting month of the repeat */
      join have as Y on X.cust_id = Y.cust_id    /* Y: offset i within the repeat */
      join have as Z on Y.cust_id = Z.cust_id    /* Z: row that supplies pmt */
      where
        X.months + Y.months = Z.months
      order by
        X.cust_id, X.months, Z.months
      ;
    quit;
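
Because the join relies on months being 0, 1, 2, ... with no gaps or duplicates inside each cust_id, it may be worth checking that before running it. A minimal sketch follows (the column aliases are illustrative); every row it returns is a group that violates the assumption.

```sas
proc sql;
  /* list groups whose months values are not exactly 0 .. n-1 */
  select cust_id,
         count(*)    as n_rows,
         min(months) as min_months,
         max(months) as max_months
  from have
  group by cust_id
  having min(months) ne 0
      or max(months) ne count(*) - 1
      or count(distinct months) ne count(*);
quit;
```

With the sample data no rows should be selected, since both AA and BB satisfy the condition.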

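Data step

A DOW loop is used to count the group size. A two-level nested loop crosses the combinations and computes the three point= values needed to retrieve the relevant rows.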

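    data want2;
      if 0 then set have; * prep pdv to match have;

      retain point_end ;

      /* number of rows before this group (0 for the first group) */
      point_start = sum(point_end,0);

      /* DOW loop: count the rows in the current cust_id group */
      do group_count = 1 by 1 until (last.cust_id);
        set have(keep=cust_id);
        by cust_id;
      end;

      do index1 = 1 to group_count;

        /* months for this repeat */
        point1 = point_start + index1;
        set have (keep=months) point = point1;

        do index2 = 0 to group_count - index1 ;

          /* pmt comes from the row index2 past the start of the repeat */
          point2 = point_start + index1 + index2;
          set have (keep=pmt) point=point2;

          /* i is the months value of row index2 + 1 in the group */
          point3 = point_start + index2 + 1;
          set have (keep=months rename=months=i) point=point3;

          output;
        end;
      end;

      /* point1 now holds the last row of this group */
      point_end = point1;

      keep cust_id pmt months i;
    run;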
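
The DOW loop that counts the group size can be tried on its own. The sketch below assumes have is sorted by cust_id, as in the sample data; the data set name group_sizes is hypothetical. It writes one row per customer at the implicit output that ends each pass of the data step.

```sas
data group_sizes;
  /* UNTIL is tested before the BY 1 increment, so on exit */
  /* group_count equals the number of rows read for the group */
  do group_count = 1 by 1 until (last.cust_id);
    set have(keep=cust_id);
    by cust_id;
  end;
  keep cust_id group_count;
run;
```

For the sample data this gives a group_count of 6 for AA and 4 for BB.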