SAS: total by category

Time: 2013-06-06 20:31:13

Tags: sas

I have the following sample data:

data have;
   input username $ amount betdate : datetime.;
   dateOnly = datepart(betdate) ;
   format betdate DATETIME.;
   format dateOnly ddmmyy8.;
   datalines; 
player1 90 12NOV2008:12:04:01
player1 -100 04NOV2008:09:03:44
player2 120 07NOV2008:14:03:33
player1 -50 05NOV2008:09:00:00
player1 -30 05NOV2008:09:05:00
player1 20 05NOV2008:09:00:05
player2 10 09NOV2008:10:05:10
player2 -35 15NOV2008:15:05:33
run;
 PROC PRINT data=have; RUN;

proc sort data=have;
   by username betdate;
run;

data want;
   set have;
   by username dateOnly betdate;   
   retain calendarTime eventTime cumulativeDailyProfit profitableFlag totalDailyProfit;
   if first.username then calendarTime = 0;
   if first.dateOnly then calendarTime + 1;
   if first.username then eventTime = 0;
   if first.betdate then eventTime + 1;   

   if first.username then cumulativeDailyProfit= 0;
   if first.dateOnly then cumulativeDailyProfit= 0;
   if first.betdate then cumulativeDailyProfit+ amount;

   if first.dateOnly then totalDailyProfit = 0;
   if first.betdate then totalDailyProfit + amount;
run;

PROC PRINT data=want; RUN;

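In the output, the 'cumulativeDailyProfit' column is exactly what I want: a running value that grows by the 'amount' field on each row. However, I don't want the same behaviour for 'totalDailyProfit', because that field should show the profit at the end of the day, i.e. the last value of cumulativeDailyProfit for each customer on that day.

For example, the eight rows above (after sorting) would ideally show: -100, -60, -60, -60, 90, 120, 10, -35. Then, if that value is greater than 0, I would set the 'profitableFlag' boolean on every row belonging to that day and that customer.

Can this actually be done in a data step? I would like to be able to run the following query (with the right flag in each case clause) to get the mean stake, the mean stake on winning days, and the mean stake on losing days.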

proc sql;
    select calendarTime,
    mean(amount) as meanStake,
    mean(case when profitableFlag  = 0 then amount else . End) as meanLosingDayStake,
    mean(case when profitableFlag  = 1 then amount else . End) as meanWinningDayStake
    from want
    group by 1;     
quit;

1 answer:

Answer 0: (score: 1)

Try this query:

proc sql;
    select calendarTime,
    avg(amount) as meanStake,
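    /* non-matching rows become missing (.), which avg() ignores; */
    /* using 0 instead would pull both means toward zero          */
    avg(case when profitableFlag  = 0 
            then amount else . End) as meanLosingDayStake,
    avg(case when profitableFlag  = 1
            then amount else . End) as meanWinningDayStake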
    from want
    group by calendarTime;     
quit;
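
The query only gives sensible results once 'profitableFlag' is populated, and the question asks whether the end-of-day total can be produced in a data step. One possible approach is a "double DOW loop": read each username/dateOnly group once to total it, then read the same rows again to write them out with the total attached. This is a minimal sketch, not the asker's code. It assumes the 'have' dataset from the question, writes the sorted copy to a hypothetical dataset named 'sorted', and leaves out 'eventTime' for brevity (it could be added in the second loop the same way as 'calendarTime').

```sas
/* Sort into a separate dataset ("sorted" is an illustrative name) */
proc sort data=have out=sorted;
   by username dateOnly betdate;
run;

data want;
   /* Pass 1: total the amounts for one username/dateOnly group */
   totalDailyProfit = 0;
   do until (last.dateOnly);
      set sorted;
      by username dateOnly;
      totalDailyProfit + amount;
   end;

   /* 1 when the customer's day ended in profit, otherwise 0 */
   profitableFlag = (totalDailyProfit > 0);

   /* Pass 2: re-read the same group and output every row */
   cumulativeDailyProfit = 0;
   do until (last.dateOnly);
      set sorted;
      by username dateOnly;
      if first.username then calendarTime = 0;
      if first.dateOnly then calendarTime + 1;
      cumulativeDailyProfit + amount;
      output;
   end;
run;

proc print data=want; run;
```

With the sample data, 'totalDailyProfit' should come out as the values listed in the question (-100, -60, -60, -60, 90, 120, 10, -35), and 'profitableFlag' is then constant within each customer-day, so it can be used directly in the case expressions of the query.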