SAS array cannot handle a long list of variables

Time: 2019-02-15 11:37:37

Tags: arrays sas

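I am trying to apply log, square, cubic, and log-odds (logit) transformations to my input data, to get an exhaustive overview of which transformation performs best in a univariate regression.

I tried the following code on a dataset with 1,000 variables. It either returns an error, runs out of memory, or does not execute at all. Are there any limits on using arrays to transform variables this way?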

/*Create a table for reference*/
DATA input_data;
    ARRAY var_[*] var_1-var_1000;

    DO i = 1 to 1000;
        DO i = 1 to 1000;
            var_(i)= i*j;
            output;
        END;
    END;
RUN;

/*Log, square, cubic, logit transform all variables*/
DATA input_transform;
    SET input_data;
    ARRAY var[*]    var_1-var_1000;
    ARRAY log[*]    log_1-log_1000;
    ARRAY logit[*]  logit_1-logit_1000;
    ARRAY sq[*]     sq_1-sq_1000;
    ARRAY cubic[*]  cubic_1-cubic_1000;

    DO i = 1 to 1000;
        log(i)      = log(var(i));
        logit(i)    = log((var(i))/(1-var(i)));
        sq(i)       = var(i)**2;
        cubic(i)    = var(i)**3;
    END;
RUN;

Expected result: a new dataset with 5,000 variables, holding each variable together with its transformations.

2 answers:

Answer 0: (score: 1)

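You are using I as the index variable for both of your nested DO loops. That is probably messing them up.

Your first data step is writing 1,000,000 observations of 1,002 variables, with only the lower-left triangle of the "array" populated. Did you really intend to have the OUTPUT statement inside the inner loop?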
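The effect of where OUTPUT sits can be seen on a small scale. The following is a minimal sketch, not the asker's actual code: the macro variable `n` and the dataset names `out_inner` and `out_outer` are made up for illustration, and the loops use distinct index variables `i` and `j`.

```sas
%let n = 3;  /* hypothetical small size so the result is easy to inspect */

/* OUTPUT inside the inner loop: one observation is written after every
   single assignment, so the step writes n*n observations */
DATA out_inner;
    ARRAY var_[*] var_1-var_&n.;
    DO i = 1 TO &n.;
        DO j = 1 TO &n.;
            var_(j) = i*j;
            OUTPUT;
        END;
    END;
RUN;

/* OUTPUT after the inner loop: one observation per value of i,
   written only once all n array elements have been assigned */
DATA out_outer;
    ARRAY var_[*] var_1-var_&n.;
    DO i = 1 TO &n.;
        DO j = 1 TO &n.;
            var_(j) = i*j;
        END;
        OUTPUT;
    END;
RUN;
```

With `n` set to 3, the first step should write 9 observations and the second 3. Scaled up to 1,000, that is the difference between 1,000,000 wide rows and 1,000.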

Answer 1: (score: 0)

In theory there is no problem, as long as your code is correct. Here is an example and its log.

option notes;
%let size=1000;

/*Create a table for reference*/
DATA input_data;
    ARRAY var_[*] var_1-var_&size.;

    DO i = 1 to &size.;
        DO j = 1 to &size.;
            var_(j)= i*j;
        END;
        output;
    END;
RUN;

/*Log, square, cubic, logit transform all variables*/
DATA input_transform;
    SET input_data;
    ARRAY _var[*]    var_1-var_&size.;
    ARRAY _log[*]    log_1-log_&size.;
    ARRAY _logit[*]  logit_1-logit_&size.;
    ARRAY _sq[*]     sq_1-sq_&size.;
    ARRAY _cubic[*]  cubic_1-cubic_&size.;

    DO i = 1 to &size.;
        _log(i)      = log(_var(i));
        _logit(i)    = sqrt(_var(i));
        _sq(i)       = _var(i)**2;
        _cubic(i)    = _var(i)**3;
    END;
RUN;

And the log:

1576      option notes;
1577      %let size=1000;
1578
1579      /*Create a table for reference*/
1580      DATA input_data;
1581          ARRAY var_[*] var_1-var_&size.;
1582
1583          DO i = 1 to &size.;
1584              DO j = 1 to &size.;
1585                  var_(j)= i*j;
1586              END;
1587              output;
1588          END;
1589      RUN;

NOTE: The data set WORK.INPUT_DATA has 1000 observations and 1002
      variables.
NOTE: DATA statement used (Total process time):
      real time           0.03 seconds
      cpu time            0.03 seconds


1590
1591      /*Log, square, cubic, logit transform all variables*/
1592      DATA input_transform;
1593          SET input_data;
1594          ARRAY _var[*]    var_1-var_&size.;
1595          ARRAY _log[*]    log_1-log_&size.;
1596          ARRAY _logit[*]  logit_1-logit_&size.;
1597          ARRAY _sq[*]     sq_1-sq_&size.;
1598          ARRAY _cubic[*]  cubic_1-cubic_&size.;
1599
1600          DO i = 1 to &size.;
1601              _log(i)      = log(_var(i));
1602               _logit(i)    = sqrt(_var(i));
1603              _sq(i)       = _var(i)**2;
1604              _cubic(i)    = _var(i)**3;
1605          END;
1606      RUN;

NOTE: There were 1000 observations read from the data set
      WORK.INPUT_DATA.
NOTE: The data set WORK.INPUT_TRANSFORM has 1000 observations and 5002
      variables.
NOTE: DATA statement used (Total process time):
      real time           0.12 seconds
      cpu time            0.10 seconds
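Two details of the example above are worth spelling out. First, the arrays carry an underscore prefix (`_log` rather than `log`). When an array shares its name with a SAS function, parentheses after that name are treated as array references, so `log(var(i))` in the question would not call the LOG function. Second, the example stores `sqrt(...)` in the `_logit` array rather than a logit. For values of 1 or more, such as `i*j`, the expression `log(v/(1-v))` has no valid result. The following is only a sketch of how the transforms could be guarded. The dataset name `transform_sketch`, the three-element arrays, and the test values are assumptions for illustration.

```sas
/* Sketch: compute log and logit only where they are defined */
DATA transform_sketch;
    ARRAY _var[*]   var_1-var_3;
    ARRAY _log[*]   log_1-log_3;
    ARRAY _logit[*] logit_1-logit_3;

    /* hypothetical test values: inside (0,1), above 1, and negative */
    var_1 = 0.25;
    var_2 = 2;
    var_3 = -1;

    DO i = 1 TO dim(_var);
        /* log is defined for positive values only */
        IF _var(i) > 0 THEN _log(i) = log(_var(i));
        /* logit is defined only strictly between 0 and 1 */
        IF 0 < _var(i) < 1 THEN _logit(i) = log(_var(i) / (1 - _var(i)));
    END;

    DROP i;
RUN;
```

Elements that fail a guard are simply left missing, which avoids invalid-argument notes in the log for those elements.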