Question

I have a data set whose variable names encode two pieces of information: the measurement and the category.

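For example, Var1A holds the first measurement (e.g. blood pressure) for category A, while Var2B holds the second measurement (e.g. heart rate) for category B. The categories could be something like male/female.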

Key Var1A Var2A Var1B Var2B
--- ----- ----- ----- -----
002   1     2     3     4
031   5     6     7     8
028   9     10    11    12

I need to condense each measurement variable across the category types, so that there is one row per key and type:

Key Type Var1 Var2
--- ---- ---- ----
002   A    1    2
002   B    3    4
028   A    9    10
028   B    11   12
031   A    5    6
031   B    7    8

The sort order of the condensed data set does not matter to me.

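The code I came up with works and produces the data set above, but I basically brute-forced and fiddled my way to a solution. I would like to know whether there is a more direct or intuitive way to do this, ideally without having to sort first and drop so many variables.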

data have;
  input key $ Var1A Var2A Var1B Var2B;

  datalines;
  002 1 2   3   4
  031 5 6   7   8
  028 9 10  11  12
  ;
run;

proc sort data = have out = step1_sort;
  by key;
run;

proc transpose data = step1_sort out = step2_transpose;
  by key;
run;

data step3_assign_type_and_variable (drop = _NAME_);
  set step2_transpose ;

  if      _NAME_ = 'Var1A' then do;
      variable = 'Var1';
      type = 'A';
    end;
  else if _NAME_ = 'Var1B' then do;
      variable = 'Var1';
      type = 'B';
    end;
  else if _NAME_ = 'Var2A' then do;
      variable = 'Var2';
      type = 'A';
    end;
  else if _NAME_ = 'Var2B' then do;
      variable = 'Var2';
      type = 'B';
    end;
run;

proc transpose  data = step3_assign_type_and_variable 
                out  = step4_get_want (drop = _NAME_);
  var col1;
  by key type;
  id variable;
run;

Answer 1

I came up with the same approach, except that I replaced your brute-force IF/ELSE chain with cleaner substring calls:

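** use this step to replace your brute force code **;
data step3_assign_type_and_variable;
    set step2_transpose;
    * last character of the original variable name is the category (A or B);
    type = upcase(substr(_name_,length(_name_),1));
    * first four characters are the measurement name (Var1 or Var2);
    variable = propcase(substr(_name_,1,4));
    drop _name_;
run;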
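
As an aside that is not part of the answer above: if the set of categories is small and known in advance, the reshaping could probably be done in a single DATA step with explicit OUTPUT statements, which avoids both the sort and the two transposes. The sketch below assumes the `have` data set from the question and exactly two categories (A and B) with two measurements each. The output name `want` is only an illustrative choice.

```sas
/* Sketch only: assumes HAVE contains key, Var1A, Var2A, Var1B, Var2B. */
data want (keep = key type Var1 Var2);
  set have;

  /* Write one row for category A */
  type = 'A';
  Var1 = Var1A;
  Var2 = Var2A;
  output;

  /* Write a second row for category B */
  type = 'B';
  Var1 = Var1B;
  Var2 = Var2B;
  output;
run;
```

Each input row produces two output rows in the original key order, which should be acceptable since the sort order does not matter here. With many categories or measurements, the transpose-based approach (or arrays) would likely scale better than hand-written assignments.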
