proc sql: computing multiple summary statistics under different conditions in one step

Time: 2020-06-26 05:03:49

Tags: sql sas

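I want to use PROC SQL in SAS to compute two averages: one over the rows before a specific date and another over the rows after it. Currently I do this in two steps and then merge the results. Is there a way to do it in a single step? Thanks.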

proc SQL;
create table temp.consensus_forecast_1 as 
select distinct gvkey, datadate, avg(meanest) as avg_before from temp.consensus_forecast
where cal_date < fdate
group by gvkey, fdate;
quit;

proc SQL;
create table temp.consensus_forecast_2 as 
select distinct gvkey, datadate, avg(meanest) as avg_after from temp.consensus_forecast
where cal_date > fdate
group by gvkey, fdate;
quit;

3 answers:

Answer 0: (score: 2)

proc sql;
  create table temp.consensus_forecast as
  select gvkey, datadate,
    avg(case when cal_date < fdate then meanest else . end) as avg_before,
    avg(case when cal_date >= fdate then meanest else . end) as avg_after
  from temp.consensus_forecast
  group by gvkey, fdate;
quit;

The DISTINCT clause is not needed; GROUP BY takes care of that. This is also easy to do with PROC SUMMARY.
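As a rough illustration of the PROC SUMMARY route, here is a minimal sketch. It assumes the same input table and columns as the question; the names work.flagged, work.avg_by_period, period, and avg_meanest are hypothetical and chosen only for this example. Rows where cal_date equals fdate are placed in the "after" group here, which is an assumption you may want to change.

```sas
/* Flag each row as before or on/after fdate.
   A view avoids writing a second copy of the data.
   (period, work.flagged are hypothetical names.) */
data work.flagged / view=work.flagged;
  set temp.consensus_forecast;
  length period $6;
  if cal_date < fdate then period = 'before';
  else period = 'after';   /* assumption: ties count as "after" */
run;

/* One mean per gvkey / fdate / period combination.
   NWAY keeps only the full cross-classification. */
proc summary data=work.flagged nway;
  class gvkey fdate period;
  var meanest;
  output out=work.avg_by_period (drop=_type_ _freq_) mean=avg_meanest;
run;
```

Unlike the PROC SQL query, this produces one row per period rather than two columns, so a PROC TRANSPOSE step would be needed to get avg_before and avg_after side by side. Note also that PROC SUMMARY drops observations with missing CLASS values unless the MISSING option is specified.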

Answer 1: (score: 1)

Use the fact that SAS represents true as 1 and false as 0:

proc SQL;
    create table temp.consensus_forecast as 
    select distinct gvkey, datadate, 
        /* each comparison evaluates to 1 (true) or 0 (false) */
        sum(meanest * (cal_date < fdate)) / sum(cal_date < fdate) as avg_before,
        sum(meanest * (cal_date > fdate)) / sum(cal_date > fdate) as avg_after
    from temp.consensus_forecast
    group by gvkey, fdate;
quit;

This gives the same result as your code.

Note that this may be wrong, because you are ignoring the case where cal_date = fdate.
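To make the effect of the cal_date = fdate case concrete, here is a small sketch on made-up data. The table work.demo and its values are invented purely for illustration, and plain integers stand in for SAS dates. It compares a strict "after" average with one that includes the tie.

```sas
/* Toy data: one firm, fdate = 3, one row falling exactly on fdate. */
data work.demo;
  input gvkey cal_date fdate meanest;
  datalines;
1 1 3 10
1 2 3 20
1 3 3 30
1 4 3 40
;
run;

proc sql;
  select gvkey, fdate,
         avg(case when cal_date <  fdate then meanest else . end) as avg_before,
         avg(case when cal_date >  fdate then meanest else . end) as avg_after_strict,
         avg(case when cal_date >= fdate then meanest else . end) as avg_after_incl
  from work.demo
  group by gvkey, fdate;
quit;
```

With this data, avg_before is 15 (rows 10 and 20), avg_after_strict is 40 (the row with meanest = 30 is dropped from both sides), and avg_after_incl is 35 (rows 30 and 40). Which of the two "after" definitions is right depends on how the fdate row should be treated.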

Answer 2: (score: 1)

select whatever, avg(case when date > cutoff then myValue else null end) as avg_aft, avg(case when date <= cutoff then myValue else null end) as avg_bef from ... etc.