Sorting a very large data set with PROC SORT fails with an unusual error

Time: 2014-04-24 16:03:46

Tags: sorting sas

I am running this code on LINUX; the data set is on the MAINFRAME and has 60,000,000+ observations...

proc sort data=test_history force;
by acct score;
run;

I get the following error...

NOTE: There were 67397829 observations read from the data set test_HISTORY.
435 ERROR: Failure while merging sorted runs from utility file 1 to final output.
436 ERROR: Failure encountered during external sort.
437 ERROR: Attempt to communicate with server AMDAHL refused by server.  The current request failed.
438 NOTE: The SAS System stopped processing this step because of errors.
439 NOTE: SAS set option OBS=0 and will continue to check statements. This might cause NOTE: No observations in data set.
440 WARNING: The data set test_HISTORY may be incomplete.  When this step was stopped there were 20002488 observations and 148
441          variables.
442 ERROR: The connection to server AMDAHL has been lost.  The current request failed.  This error may reoccur on subsequent requests.

1 answer:

Answer 0 (score: 3)

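See this SUGI paper.

It describes several options that reduce the chance of errors when sorting large data sets in a mainframe environment. I have pasted one of them below.

This code limits the number of sort work areas. Use SORTWKNO either as a global option or as a PROC SORT option. It sets the maximum number of sort work areas that PROC SORT is allowed to use.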

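/* Global setting: at most 3 sort work areas */
options SORTWKNO=3;
/* Step-level setting on this PROC SORT (note the required DATA=) */
proc sort data=test_history SORTWKNO=5;
by acct score;
run;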
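
To check which value is currently in effect, PROC OPTIONS can write a single option to the log. This is only a minimal sketch. It assumes the step runs in a SAS session on the mainframe, since SORTWKNO is a host-specific option and a SAS session on Linux may not recognize it.

```sas
/* Assumption: this runs in a mainframe (z/OS) SAS session,   */
/* where the host-specific SORTWKNO option is available.      */
proc options option=sortwkno;  /* writes the current setting to the log */
run;
```

Running it before and after the OPTIONS statement shows whether the global setting took effect.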