I have a strange problem with PROC MEANS and the StackODSOutput option. Consider this example.
First, I create a dummy dataset to analyze.
/* Step-1: Create a dummy dataset for analysis */
data ds1;
label x = 'Variable X';
label y = 'Variable Y';
do i = 1 to 100;
x = ranuni(1234);
y = ranuni(5678);
keep x y;
output;
end;
run;
Then I run PROC MEANS with the StackODSOutput option. This creates an output dataset called "stats".
/* Step-2: I run PROC means to capture the output in a dataset called stats */
proc means data=ds1 StackODSOutput mean;
var x y;
ods output summary=stats;
run;
This "stats" dataset has a variable called "Label". I know the variable exists because I ran PROC CONTENTS and can see it listed there.
/* Step-3: Confirm visually that there is a variable called Label in stats dataset */
proc contents data=stats varnum; run;
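However, I cannot seem to reference this "Label" variable anywhere. For example, the following PROC SQL statement generates an error. I can reference every other variable in the "stats" dataset without any problem.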
/* Step-4: But, I cannot seem to reference the variable called "Label" in stats dataset! */
proc sql;
select Variable, Label from stats;
quit;
The error is as follows:
43 proc sql;
44 select Variable, Label from stats;
ERROR: The following columns were not found in the contributing tables: Label.
NOTE: PROC SQL set option NOEXEC and will continue to check the syntax of statements.
45 quit;
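Do you know what I am doing wrong? Is there a problem with my SAS code or my SAS installation?
My SAS version is SAS 9.3 (9.03.01M2P08152012).
Thanks.
Karthik.
As Reeza requested, here is the full log output.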
1 The SAS System 15:52 Wednesday, November 9, 2016
1 %_eg_hidenotesandsource;
5 %_eg_hidenotesandsource;
20
21 /* Step-1: Create a dummy dataset for analysis */
22 data ds1;
23 label x = 'Variable X';
24 label y = 'Variable Y';
25 do i = 1 to 100;
26 x = ranuni(1234);
27 y = ranuni(5678);
28 keep x y;
29 output;
30 end;
31 run;
NOTE: The data set WORK.DS1 has 100 observations and 2 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds
32
33 /* Step-2: I run PROC means to capture the output in a dataset called stats */
34 proc means data=ds1 StackODSOutput mean;
35 var x y;
36 ods output summary=stats;
37 run;
NOTE: The data set WORK.STATS has 2 observations and 3 variables.
NOTE: There were 100 observations read from the data set WORK.DS1.
NOTE: PROCEDURE MEANS used (Total process time):
real time 0.06 seconds
cpu time 0.03 seconds
38
39 /* Step-3: Confirm visually that there is a variable called Label in stats dataset */
40 proc contents data=stats varnum; run;
NOTE: PROCEDURE CONTENTS used (Total process time):
real time 0.03 seconds
cpu time 0.03 seconds
41
42 /* Step-4: But, I cannot seem to reference the variable called "Label" in stats dataset! */
43 proc sql;
44 select Variable, Label from stats;
ERROR: The following columns were not found in the contributing tables: Label.
NOTE: PROC SQL set option NOEXEC and will continue to check the syntax of statements.
45 quit;
NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE SQL used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
2 The SAS System 15:52 Wednesday, November 9, 2016
46 /* What! */
47
48
49
50 %_eg_hidenotesandsource;
62
63
64 %_eg_hidenotesandsource;
67
Answer 0 (score: 1)
I ran into the same problem when I ran your code. I have SAS 9.4, running on Linux. Here is my analysis of the problem:
data _NULL_;
set stats;
put _all_;
run;
This shows that the variable name "Label" is not what it appears to be:
22 data _NULL_;
23 set stats;
24 put _all_;
25 run;
Variable=x Label =Variable X Mean=0.461116 _ERROR_=0 _N_=1
Variable=y Label =Variable Y Mean=0.525342 _ERROR_=0 _N_=2
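Note the space between the variable name "Label" and the equals sign. None of the other variables look like this. Perhaps the variable's name is corrupted.
Load the variable names from the dictionary.columns table into another table and look at the values: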
proc sql;
create table x as
select name as nm from dictionary.columns
where libname = 'WORK' and memname = 'STATS';
quit;
data _NULL_;
set x;
put nm= nm $hex32.;
run;
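The $HEX32. format prints each character's byte value in hexadecimal (the ASCII codes here), so you can see whether the name contains any non-printable characters. The output of this data step is: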
22 data _NULL_;
23 set x;
24 put nm= nm $hex32.;
25 run;
nm=Variable 5661726961626C652020202020202020
nm=Label 4C6162656C0000002020202020202020
nm=Mean 4D65616E202020202020202020202020
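Look at the Label variable: first, there is still a gap between its name and the hex output that follows. Second, the hex code contains some repeated zeros: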
4C6162656C0000002020202020202020
4C=L
61=a
62=b
65=e
6C=l
00=?
20=<space>
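So these ASCII zeros in the 'Label' variable name are what cause the problem. SAS can only display the name as "Label   ", with the ASCII zeros (a.k.a. ASCII NUL) shown as spaces.
Fix
I don't know of a way to reference a column whose name contains ASCII special characters, so what we can do is rename the column (in practice, copy its values into a new, correctly named one). However, we still cannot reference "Label" by name, so we need to reference it indirectly. One way is to use an array: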
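The hex inspection above can also be automated. The following is only a minimal sketch, assuming the dataset is WORK.STATS as in the question and that the corrupted name is visible through SASHELP.VCOLUMN (the view over dictionary.columns); it writes a message to the log for any column name that contains a NUL byte.

```sas
/* Sketch: scan the column names of WORK.STATS for embedded NUL bytes. */
/* Assumes WORK.STATS exists; adjust libname/memname for other tables. */
data _null_;
  set sashelp.vcolumn;
  where libname = 'WORK' and memname = 'STATS';
  /* INDEXC returns the position of the first NUL ('00'x), or 0 if none */
  pos = indexc(name, '00'x);
  if pos > 0 then
    put 'Column name has a NUL byte at position ' pos 'hex=' name $hex64.;
run;
```

Columns with clean names produce no message, so an empty result suggests the names are fine.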
data stats_fix;
set stats;
array c{*} _CHARACTER_;
var=c[1];
Label=c[2];
run;
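The resulting output dataset looks very odd. It appears to have two variables named "Label". We know that one is 'Label' and the other is 'Label' followed by three NUL bytes (hex 000000).
It may be worth raising this with SAS Technical Support as a bug in PROC MEANS; feel free to use this answer when you do.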
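If the duplicate-looking column is unwanted, one possible variation on the array approach is to keep only the cleanly named variables on output. This is a sketch under assumptions, not a confirmed fix: it assumes the corrupted column is the second character variable in stats (as the hex listing suggests) and that the KEEP= data set option will not match the corrupted name, so that column should be dropped.

```sas
/* Sketch: copy the corrupted column into a clean "Label" and drop the original. */
/* Assumption: the corrupted column is the 2nd character variable in WORK.STATS. */
data stats_fix(keep=Variable Label Mean);
  set stats;
  /* The array is defined before the new Label exists, so it only  */
  /* holds the character variables read from stats.                 */
  array c{*} _character_;
  Label = c[2];   /* new variable takes its length from c[2] */
run;

/* The clean name should now be usable in PROC SQL */
proc sql;
  select Variable, Label from stats_fix;
quit;
```

Running PROC CONTENTS on stats_fix afterwards is a quick way to confirm that only one Label column remains.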
Answer 1 (score: 1)
This is a bug. It creates a variable name with trailing blanks, which is possible in EG when validvarname is set to ANY.
Fix:
Option validvarname=V7;
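To try this, the option has to be in effect before PROC MEANS creates the dataset. A minimal sketch, reusing the ds1 dataset and the steps from the question and assuming, as this answer claims, that VALIDVARNAME=V7 avoids the corrupted name:

```sas
/* Show the current setting (Enterprise Guide typically sets it to ANY) */
proc options option=validvarname;
run;

/* Switch to V7 naming rules before the output dataset is created */
options validvarname=V7;

proc means data=ds1 StackODSOutput mean;
  var x y;
  ods output summary=stats;
run;

/* If the workaround applies, this query should no longer raise the error */
proc sql;
  select Variable, Label from stats;
quit;
```

Note that the option stays in effect for the rest of the session unless it is reset afterwards.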