SAS running total

Time: 2013-07-29 17:49:06

Tags: sas

I have some sample data, shown below, and want to calculate the number of consecutive wins and losses.

data have;
   input username $  betdate : datetime. stake winnings;
   dateOnly = datepart(betdate) ;
   format betdate DATETIME.;
   format dateOnly ddmmyy8.;
   datalines; 
    player1 12NOV2008:12:04:01 90 -90 
    player1 04NOV2008:09:03:44 100 40 
    player2 07NOV2008:14:03:33 120 -120 
    player1 05NOV2008:09:00:00 50 15 
    player1 05NOV2008:09:05:00 30 5 
    player1 05NOV2008:09:00:05 20 10 
    player2 09NOV2008:10:05:10 10 -10 
    player2 15NOV2008:15:05:33 35 -35 
    player1 15NOV2008:15:05:33 35 15 
    player1 15NOV2008:15:05:33 35 15 
;
run;
PROC PRINT; RUN;
proc sort data=have;
   by username betdate;
run;
DM "log; clear;";
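data want;
   set have;
   by username dateOnly betdate;
   retain calendarTime eventTime cumulativeDailyProfit profitableFlag;
   /* calendarTime: count of distinct betting days per player */
   if first.username then calendarTime = 0;
   if first.dateOnly then calendarTime + 1;
   /* eventTime: count of distinct bet timestamps per player */
   if first.username then eventTime = 0;
   if first.betdate then eventTime + 1;
   /* cumulativeDailyProfit: running sum of stake, reset for each player and each day */
   if first.username then cumulativeDailyProfit = 0;
   if first.dateOnly then cumulativeDailyProfit = 0;
   if first.betdate then cumulativeDailyProfit + stake;
   /* winner: 1 for a winning bet, 0 otherwise */
   if winnings > 0 then winner = 1;
   if winnings <= 0 then winner = 0;
run;
PROC PRINT; RUN;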

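For example, the first four bets for player1 are winners, so the first four rows in this column should show 1, 2, 3, 4 (at this point, four wins in a row). The fifth is a loser, so it should show -1, followed by 1, 2. The following three rows (for player2) should show -1, -2, -3, because that customer has had three losing bets in a row. How can I calculate the value of this column in a data step? How can I also have, on each row, a column with the largest number of consecutive winning bets to date and a column with the largest number of consecutive losing bets the customer has had to date?

Thanks for your help.

1 Answer:

Answer 0 (score: 3)

To do a running total like this, you can use BY with NOTSORTED and still use the first.<var> functionality. For example: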

data have;
input winlose $;
datalines;
win
win
win
win
lose
lose
win
lose
win
win
lose
;;;;
run;

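data want;
   set have;
   by winlose notsorted;   /* NOTSORTED: BY groups are runs of equal values, in file order */
   /* start of a new run: restart the counter at +1 or -1 */
   if first.winlose and winlose='win' then counter=1;
   else if first.winlose then counter=-1;
   /* within a run: extend the streak (the sum statement retains counter) */
   else if winlose='win' then counter+1;
   else counter+(-1);
run;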

Each time the value changes from 'win' to 'lose' or the reverse, first.winlose is set to 1 again.
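
Applied to the betting data in the question, the same idea might look like the sketch below. It assumes a win means winnings > 0. The dataset names flagged and streaks and the variable streak are made up for illustration. The win flag is created in a separate step because a BY variable has to exist in the input dataset.

```sas
/* Step 1: flag each bet as a win (1) or not (0). */
data flagged;
   set have;
   winner = (winnings > 0);
run;

/* Step 2: put each player's bets in time order. */
proc sort data=flagged;
   by username betdate;
run;

/* Step 3: count runs of equal winner values within each player. */
data streaks;
   set flagged;
   by username winner notsorted;
   /* first.winner is also 1 when username changes, so streaks never span players */
   if first.winner then streak = 0;
   if winner then streak + 1;      /* winning run counts up: 1, 2, 3, ... */
   else streak + (-1);             /* losing run counts down: -1, -2, -3, ... */
run;
```

With the sample data this should give 1, 2, 3, 4, -1, 1, 2 for player1 and -1, -2, -3 for player2.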

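Once you have this, you can either use a double DoW loop to append the maximum, or, perhaps more easily, get this value into a dataset and then add it on in a second data step (or proc sql) to append the desired variables.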
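
If what is needed is the longest streak so far on each row, rather than one overall maximum, a single extra pass with RETAIN may be enough. This is a different technique from the double DoW loop mentioned above. The names want_max, maxWin and maxLose are hypothetical, and the step reads the want dataset (with counter) from the example above.

```sas
data want_max;
   set want;
   retain maxWin 0 maxLose 0;           /* keep values across rows, starting at 0 */
   maxWin  = max(maxWin, counter);      /* longest winning streak so far */
   maxLose = max(maxLose, -counter);    /* longest losing streak so far, as a positive count */
run;
```

For data with several players, one would presumably add a BY username statement and reset both variables to 0 when first.username is true.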