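I am trying to learn SAS, in particular PROC REPORT, using the SASHELP.CARS data set.
What I want in the sixth column of the output, labeled "Number of Cars > Mean(Invoice)", is the number of cars whose invoice is greater than the mean invoice of their group. I am using the code below.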
PROC REPORT DATA=sashelp.CARS NOWD OUT=learning.MyFirstReport;
COLUMNS Type Origin INVOICE=Max_INVOICE INVOICE=Mean_Invoice
INVOICE=Count_Invoice TEST DriveTrain;
DEFINE Type / Group 'Type of Car' CENTER;
DEFINE Origin / Group 'Origin of Car' CENTER;
DEFINE Max_Invoice / ANALYSIS MAX 'Max of Invoice';
DEFINE Mean_Invoice / ANALYSIS MEAN 'Mean of Invoice';
DEFINE Count_Invoice / ANALYSIS N FORMAT=5.0 'Total Number of Cars' center;
DEFINE DriveTrain / ACROSS 'Type of DriveTrain of Car';
DEFINE TEST / COMPUTED 'Number of Cars > Mean(Invoice)' center;
COMPUTE TEST;
TEST=N(_c7_>Mean_Invoice);
ENDCOMP;
RUN;
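The output I get is shown in the image below.
I don't think this is the correct output, because every row in that column shows the value 1. How can I get the desired result in the sixth column of the output?
Answer 0 (score: 0)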
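The non-group columns are defined as ANALYSIS so that aggregate statistics are computed for them. One way to count the rows that satisfy a logical condition is to prepare the data so that each row carries a single flag (0 or 1); the SUM of that flag is then the count of rows for which the condition is true.
Preparation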
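As a small illustration (not part of the original answer), the DATA step below uses made-up values to show two things. First, the N function counts its non-missing arguments, so when it is given a single comparison it returns 1 whether the comparison is true or false, which would explain the column of 1s in the question. Second, adding up 0/1 comparison results gives the number of rows above a threshold. The array x and the variable mean_x are hypothetical names chosen for this sketch.

```sas
data _null_;
  /* Hypothetical values; their mean is 30 */
  array x[5] _temporary_ (10 20 30 40 50);
  mean_x = 30;

  /* N() counts non-missing arguments. A comparison evaluates to 0 or 1, */
  /* both non-missing, so N() of one comparison is 1 in either case.     */
  n_true  = n(5 > 3);
  n_false = n(3 > 5);
  put n_true= n_false=;

  /* Summing the 0/1 results counts the values above the mean */
  n_above = 0;
  do i = 1 to dim(x);
    n_above + (x[i] > mean_x);
  end;
  put n_above=;
run;
```

The log should show n_true=1 n_false=1 and n_above=2, since only 40 and 50 exceed 30.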
proc sql;
create view cars_v as
select *
, mean(invoice) as invoice_mean_over_type_origin
, (invoice > calculated invoice_mean_over_type_origin) as flag_50
from sashelp.cars
group by type, origin
;
quit;
Report
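Before building the report, the view can be sanity-checked with a plain query. The sketch below assumes the view cars_v has been created as shown above; the column names n_cars, n_above_mean and n_at_or_below_mean are hypothetical and used only for this check.

```sas
/* Illustrative check: per-group counts taken straight from the view */
proc sql;
  select type
       , origin
       , count(*)                as n_cars              /* all cars in the group   */
       , sum(flag_50)            as n_above_mean        /* invoice > group mean    */
       , count(*) - sum(flag_50) as n_at_or_below_mean  /* the remaining cars      */
  from cars_v
  group by type, origin
  ;
quit;
```

These per-group figures should correspond to the "Number of Cars > Mean ( Invoice )" and "Number of Cars <= Mean ( Invoice )" columns that the PROC REPORT step produces.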
PROC REPORT DATA=CARS_V OUT=work.MyFirstReport;
COLUMNS
Type
Origin
INVOICE/*=Max_INVOICE */
INVOICE=INVOICE_use_2/*=Mean_Invoice */
flag_50
flag_50=flag_50_use_2
flag_50_other
DriveTrain
;
DEFINE Type / Group 'Type of Car' CENTER;
DEFINE Origin / Group 'Origin of Car' CENTER;
DEFINE Invoice / ANALYSIS MAX 'Max of Invoice';
DEFINE Invoice_use_2 / ANALYSIS MEAN 'Mean of Invoice';
DEFINE flag_50 / analysis sum 'Number of Cars > Mean ( Invoice )' center;
DEFINE flag_50_use_2 / noprint analysis N ;
* noprint makes a hidden column whose value is available to compute blocks;
DEFINE flag_50_other / computed 'Number of Cars <= Mean ( Invoice )' center;
DEFINE DriveTrain / ACROSS 'Type of DriveTrain of Car';
compute flag_50_other;
flag_50_other = flag_50_use_2 - flag_50.sum;
endcomp;
RUN;
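NOWD is the default option, so new PROC REPORT code does not need to specify it explicitly.
Aliases such as invoice=mean_invoice work, but a future reader of the code may be unsure what a line such as DEFINE Mean_Invoice / ANALYSIS MEAN 'Mean of Invoice'; computes: is it the mean, or the mean of a mean?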