Question

我有以下内容：

data have;
  input id  day $ col1;
datalines;
84 day_1_x 302
84 day_1_y 443
18 day_2_x 774
18 day_2_y 281
61 day_5_x 488
61 day_5_y 564
; 
run;

我想基于x创建两个新列y和day。

| id | day |   x |   y |
|----+-----+-----+-----|
| 84 |   1 | 302 | 443 |
| 18 |   2 | 774 | 281 |
| 61 |   5 | 488 | 564 |
|----+-----+-----+-----|

我天真的尝试并没有让我在那里。

  data dont_quite_want;
    set have;
    length x 8 y 8;

    if scan(day, 3, '_') = 'x' then x = col1;
    else y = col1; 

  run;

| Obs | id | day     | col1 | x   | y   |
|-----+----+---------+------+-----+-----|
|   1 | 84 | day_1_x |  302 | 302 | .   |
|   2 | 84 | day_1_y |  443 | .   | 443 |
|   3 | 18 | day_3_x |  774 | 774 | .   |
|   4 | 18 | day_3_y |  281 | .   | 281 |
|   5 | 61 | day_5_x |  488 | 488 | .   |
|   6 | 61 | day_5_y |  564 | .   | 564 |
|-----+----+---------+------+-----+-----|

如果有一次性的方法，我不会感到惊讶，但我想知道，如何将转换分解为单独的步骤？

Answer 1

我可能这样做的方法是进行转置 - 这需要转置，你可以一次性对数据进行查看，如果这对你很重要的话。

data for_transpose/view=for_transpose;
  set have(rename=day=long_day);
  day = scan(long_day,2,'_');
  varname = scan(long_day,3,'_');
run;

proc transpose data=for_transpose out=want;
  by id day notsorted;
  var col1;
  id varname;
run;

但是，在一个数据步骤中也很容易做到 - 只是更加硬编码。

data want;
    set have(rename=day=long_day);
    by id notsorted; *your two ID rows still need to be adjacent;
    length x 8 y 8;
    retain x y day;
    if first.id then call missing(x,y);  *reset them;
    day = scan(long_day,2,'_');
    if scan(long_day, 3, '_') = 'x' then x = col1;
    else y = col1; 
    if last.id then output;
run;

Answer 2

这是转置版本。 @Joe为您提供了完整的数据步骤。

/*Pull out the day and variable name from the day column
  Use and index so you don't have to sort*/
data have(index=(id));
set have;
d = input(scan(day,2,"_"),best.);
_name_ = scan(day,3,"_");
run;

/*Transpose the data keeping the variables you want and renaming
  as needed */
proc transpose data=have out=want(drop=_name_ rename=(d=day));
by id d;
var col1;
id _name_;
run;

SAS：根据第三列将一列转换为两列

2 个答案: