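I'm trying to group by a custom-formatted variable in PROC SQL, but so far I haven't found a solution. The log shows no errors (such as a summary-statistic error), and all of the code runs. Here is a simple example: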
DATA have;
INPUT value1;
DATALINES;
1.22
0.99
0.22
4.00
9.99
;
RUN;
PROC FORMAT;
value valuefmt
low-.99="Below $1.00"
1-5="$1-5.00"
5-high="Above $5.00";
RUN;
DATA have;
set have;
FORMAT value1 valuefmt.;
RUN;
PROC SQL;
SELECT count(*), value1 from have group by value1;
QUIT;
PROC SQL returns counts grouped by the raw values of value1 rather than by the formatted values:
          value1
---------------------
       1  Below $1.00
       1  Below $1.00
       1  $1-5.00
       1  $1-5.00
       1  Above $5.00
SAS supports this in PROC FREQ and PROC TABULATE. For example:
PROC TABULATE data=have;
CLASS value1;
TABLE value1;
RUN;
----------------------------------------
|                value1                |
|--------------------------------------|
|Below $1.00 |  $1-5.00   |Above $5.00 |
|------------+------------+------------|
|     N      |     N      |     N      |
|------------+------------+------------|
|        2.00|        2.00|        1.00|
----------------------------------------
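The question mentions PROC FREQ alongside PROC TABULATE but only shows the latter. A minimal sketch of the FREQ counterpart, assuming the `have` data set with `valuefmt.` already attached to value1 as above, might look like this:

```sas
/* Sketch only: PROC FREQ builds its table levels from the formatted
   value of value1, so the five rows should collapse into three bands. */
proc freq data=have;
    tables value1 / nocum nopercent;  /* suppress cumulative and percent columns */
run;
```

The NOCUM and NOPERCENT options are there only to keep the output comparable to the simple counts in the TABULATE table; they can be dropped.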
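Any ideas on how to do something similar with PROC SQL?
Answer 0 (score: 3)
One way is to use the PUT() function in the GROUP BY clause. If the formatted value is all you need, you can return it directly. Here, group by 2 refers to the second column in the SELECT list, which is the PUT() expression: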
proc sql;
select count(*) as N
, put(value1,valuefmt.) as CharacterValue
from have
group by 2
;
quit;
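If the positional reference feels fragile, the same grouping can probably be written by naming the PUT() result and referring to it with the CALCULATED keyword. This is only a sketch: the alias `PriceBand` is a made-up name for illustration, and the `have` data set and `valuefmt.` format are assumed to exist as defined in the question.

```sas
/* Sketch only: PriceBand is a hypothetical alias, not from the original post. */
proc sql;
    select put(value1, valuefmt.) as PriceBand  /* character label for each row */
         , count(*) as N
    from have
    group by calculated PriceBand               /* group on the label, not the raw number */
    ;
quit;
```

Because PUT() yields a character string, the grouping happens on the label text itself, which is why this works regardless of whether a format is permanently attached to value1.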
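Otherwise, if you want to return a value of the original (numeric) type, you need to wrap it in an aggregate function such as MIN(). You also need to reapply the format: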
proc sql;
select count(*) as N
, min(value1) as Formatted format=valuefmt.
, min(value1) as Raw
from have
group by put(value1,valuefmt.)
;
quit;
Result:
       N  Formatted         Raw
-------------------------------
       2  $1-5.00          1.22
       1  Above $5.00      9.99
       2  Below $1.00      0.22
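Note that the groups above come back in the alphabetical order of their labels. If numeric order is preferred, one option is to sort on the aggregated raw value. A minimal sketch, assuming the same `have` data set and `valuefmt.` format; the alias `Band` is hypothetical:

```sas
/* Sketch only: Band is a hypothetical alias. Sorting by column 2 uses the
   stored numeric MIN(value1), not the formatted label that is displayed. */
proc sql;
    select count(*) as N
         , min(value1) as Band format=valuefmt.
    from have
    group by put(value1, valuefmt.)
    order by 2
    ;
quit;
```

This relies on ORDER BY comparing the underlying numeric values, so the bands should be listed from the lowest price range to the highest.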