Rolling up groups in a matrix

Time: 2017-12-05 13:38:17

Tags: sas sas-iml

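This is the data I have. I used PROC TABULATE to show how it would look in Excel and to make it easier to visualize. The goal is to make sure that groups strictly below the diagonal (I know the table is a rectangle; by "diagonal" I mean (1,1), (2,2), ..., (7,7)) are rolled up the column until they reach the diagonal or the group size is at least 75.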

      1   2   3    4    5    6   7  (month variable)
(age)
  1  80  90  100  110  122  141 88
  2  80  90  100  110   56   14 88
  3  80  90   87   45   12   41 88
  4  24  90  100  110   22  141 88
  5  0   1    0    0    0    0   2
  6  0   1    0    0    0    0   6
  7  0   1    0    0    0    0   2
  8  0   1    0    0    0    0  11

I have used if/then statements to regroup certain data values, but I need a general way to do this for other data sets. Thanks in advance.

Desired result

      1   2   3    4    5    6   7  (month variable)
(age)
  1  80  90  100  110  122  141 88
  2  80  90  100  110   56   14 88
  3  104 90   87   45   12   41 88
  4  0   94  100  110   22  141 88
  5  0   0    0    0    0    0   2
  6  0   0    0    0    0    0   6
  7  0   0    0    0    0    0   13
  8  0   0    0    0    0    0   0

1 answer:

Answer 0 (score: 0)

Simulate some categorical data for patients that need to be counted:

data mock;
  do patient_id = 1 to 2500;
    month = ceil(7*ranuni(123));
    age = ceil(8*ranuni(123));
    output;
  end;
  stop;
run;

Create a table of counts (N), similar to the one shown in the question:

options missing='0';

proc tabulate data=mock;
  class month age;
  table age,month*n=''/nocellmerge;
run;

For each month (column), get the count of patients below the diagonal:

proc sql;
/*  create table subdiagonal_column as */
  select month, count(*) as subdiag_col_freq
  from mock
  where age > month
  group by month;

For each age (row), get the count of patients to the left of the diagonal:

/*  create table prediagonal_row as */
  select age, count(*) as prediag_row_freq
  from mock
  where age > month
  group by age;
quit;
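
The queries above show how many patients sit below the diagonal, but not the roll-up itself. What follows is a minimal SAS/IML sketch of one possible reading of the rule in the question, not a definitive implementation. It assumes each column is walked from the bottom row upward, a sub-diagonal cell smaller than 75 is emptied and its count carried into the cell above, and anything still carried at the end is added to the diagonal cell. The matrix x is typed in from the question's table; the names minSize, carry, and total are illustrative and not from the original post.

```sas
proc iml;
  /* Counts from the question: rows = age 1-8, columns = month 1-7 */
  x = {80 90 100 110 122 141 88,
       80 90 100 110  56  14 88,
       80 90  87  45  12  41 88,
       24 90 100 110  22 141 88,
        0  1   0   0   0   0  2,
        0  1   0   0   0   0  6,
        0  1   0   0   0   0  2,
        0  1   0   0   0   0 11};
  minSize = 75;  /* minimum group size stated in the question */

  do j = 1 to ncol(x);
    carry = 0;   /* count rolled up from the cells already visited */
    /* Walk column j from the bottom row up to the row just below the diagonal */
    do i = nrow(x) to j+1 by -1;
      total = x[i,j] + carry;
      if total < minSize then do;
        x[i,j] = 0;        /* group too small: empty it and keep rolling up */
        carry = total;
      end;
      else do;
        x[i,j] = total;    /* group is large enough: it stays here */
        carry = 0;
      end;
    end;
    /* Anything still carried stops at the diagonal cell, if the column has one */
    if j <= nrow(x) then x[j,j] = x[j,j] + carry;
  end;

  print x;
quit;
```

Traced by hand against the question's table, this appears to give the desired result, for example 104 at (3,1), 94 at (4,2), and 13 at (7,7). To run it on other data, the literal matrix could be replaced by counts read from a data set, with the same loop applied unchanged.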
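
Other data sets can be tricky if the categorical values are not +1 monotonic (that is, not consecutive integers such as 1, 2, 3, ...). To perform a similar process on non-monotonic categorical values, you need to create surrogate variables that are +1 monotonic. For example: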

data mock;
  do item_id = 1 to 2500;
    pet = scan ('cat dog snake rabbit hamster', ceil(5*ranuni(123)));
    place = scan ('farm home condo apt tower wild', ceil(6*ranuni(123)));
    output;
  end;
run;

proc tabulate data=mock;
  class pet place;
  table pet,place*n=''/nocellmerge;
run;

proc sql;
  /* one row per distinct category value; DISTINCT returns them sorted */
  create table unq_pets as select distinct pet from mock;
  create table unq_places as select distinct place from mock;
quit;

data pets;
  set unq_pets;
  pet_num = _n_;
run;

data places;
  set unq_places;
  place_num = _n_;
run;

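proc sql;
  /* count items below the "diagonal" defined by the surrogate numbers */
  select place_num, mock.place, count(*) as subdiag_col_freq
  from mock
  join pets on pets.pet = mock.pet
  join places on places.place = mock.place
  where pet_num > place_num
  group by place_num, mock.place  /* group by every non-aggregated column to avoid remerging */
  order by place_num
  ;
quit;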