How do I count distinct IDs over the past 90 days in SAS?

Time: 2017-01-18 22:13:06

Tags: sas

I have a dataset like this:

CustomerID  AccountManager TransactionID  Transaction_Time 
1111111111  FA001          TR2016001      08SEP16:11:19:25
1111111111  FA001          TR2016002      26OCT16:08:22:49
1111111111  FA002          TR2016003      04NOV16:08:05:36
1111111111  FA003          TR2016004      04NOV16:17:15:52
1111111111  FA004          TR2016005      25NOV16:13:04:16
1231231234  FA005          TR2016006      25AUG15:08:03:29
1231231234  FA005          TR2016007      16SEP15:08:24:24
1231231234  FA008          TR2016008      18SEP15:14:42:29
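
For anyone who wants to reproduce the problem, the rows above could be loaded with a DATA step along these lines. This is only a sketch: the dataset name `transaction` is taken from the code later in this thread, while the character lengths and the choice to store CustomerID as text are assumptions.

```sas
data transaction;
    /* Assumed types: IDs as character, time as a SAS datetime value. */
    input CustomerID :$10. AccountManager :$5. TransactionID :$9.
          Transaction_Time :datetime16.;
    format Transaction_Time datetime16.;
    datalines;
1111111111 FA001 TR2016001 08SEP16:11:19:25
1111111111 FA001 TR2016002 26OCT16:08:22:49
1111111111 FA002 TR2016003 04NOV16:08:05:36
1111111111 FA003 TR2016004 04NOV16:17:15:52
1111111111 FA004 TR2016005 25NOV16:13:04:16
1231231234 FA005 TR2016006 25AUG15:08:03:29
1231231234 FA005 TR2016007 16SEP15:08:24:24
1231231234 FA008 TR2016008 18SEP15:14:42:29
;
run;
```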

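CustomerID identifies each customer, and each customer can have multiple transactions. Each account manager can also handle multiple transactions. TransactionID, however, is unique in this table.

Now, for each customer and each transaction, I want to count how many distinct account managers were involved and how many transactions occurred in the 90 days leading up to that transaction. The result I'm looking for looks like this: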

CustomerID  Manager TransacID  Transaction_Time    CountTransac CountManager
1111111111  FA001   TR2016001  08SEP16:11:19:25    1            1
1111111111  FA001   TR2016002  26OCT16:08:22:49    2            1
1111111111  FA002   TR2016003  04NOV16:08:05:36    3            2
1111111111  FA003   TR2016004  04NOV16:17:15:52    4            3
1111111111  FA004   TR2016005  25NOV16:13:04:16    5            4
1231231234  FA005   TR2016006  25AUG15:08:03:29    1            1
1231231234  FA005   TR2016007  16SEP15:08:24:24    2            1
1231231234  FA008   TR2016008  18SEP15:14:42:29    3            2

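With the code below I worked out how to compute the transaction count, but I don't know how to compute the distinct manager count. I would really appreciate any help. Thank you very much.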

DATA want;
    SET transaction;
    COUNT=1;
    DO point=_n_-1 TO 1 BY -1;
        SET want(KEEP=CustomerID Transaction_Time COUNT POINT=point
            RENAME=(CustomerID =SAME_ID Transaction_Time =OTHER_TIME COUNT=OTHER_COUNT));

        IF CustomerID NE SAME_ID 
            OR INTCK ("DAY", DATEPART(OTHER_TIME), DATEPART(Transaction_Time )) > 90 
            THEN LEAVE;   
        COUNT + OTHER_COUNT;
    END;
    DROP SAME_ID OTHER_TIME OTHER_COUNT;
    RENAME COUNT=COUNT_TRANSAC;
RUN;

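1 answer:

Answer 0 (score: 3)

Your code doesn't work at all, but I see what you are trying to do. Here is something that works. I commented out the WHERE statement so you can see that it produces the result you asked for. If you really only want the past 90 days, you need the WHERE statement.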

* Always a good idea to sort first unless you are CERTAIN that
* your values are in the order you want.;
proc sort data=have;
    by customerid AccountManager transactionid;
run;

DATA want;
    SET have;
* Uncomment the WHERE statement to activate the 90-day time frame.;
*   where today()-datepart(transaction_time)<=90;
    by customerid AccountManager transactionid;
    if first.customerid
     then do;
        counttransac=0;
        countmanager=0;
     end;
    if first.AccountManager
     then countmanager+1;

    counttransac+1;

RUN;

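By using SAS's BY statement together with the FIRST. and LAST. automatic variables, you can reset the counters at each new customer ID and increment the manager counter at each new manager ID.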
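
To see what those flags look like, here is a small sketch that writes them to the log. The `demo` dataset and the `first_cust`, `first_mgr`, and `last_cust` variables are made up for illustration. The data is entered already in BY order, so no PROC SORT is needed here.

```sas
/* Hypothetical demo data, already ordered by customer and manager. */
data demo;
    input CustomerID :$10. AccountManager :$5.;
    datalines;
1111111111 FA001
1111111111 FA001
1111111111 FA002
1231231234 FA005
;
run;

data _null_;
    set demo;
    by CustomerID AccountManager;
    /* Copy the automatic flags into ordinary variables so PUT can show them. */
    first_cust = first.CustomerID;
    first_mgr  = first.AccountManager;
    last_cust  = last.CustomerID;
    put CustomerID= AccountManager= first_cust= first_mgr= last_cust=;
run;
```

Each flag is 1 on the first (or last) row of its BY group and 0 otherwise, which is what lets the DATA step above decide when to reset or increment a counter.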

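[Edit] OK, that is a lot harder. Here is code that looks back through the history before each transaction. I see why you used two SET statements, since you have to join the dataset to itself. You could probably do this with PROC SQL, but I haven't had time to look into it. Let me know if this works for you.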

* Sort each customer's and manager's transactions;
proc sort data=transaction;
    by customerid accountmanager;
run;


DATA want;
    SET transaction nobs=pmax;
    by customerid;

    length lastmgr $ 100;
    retain pstart;      * Starting row for each customer;

    * Save starting row for each customer;
    if first.customerid
     then pstart=_n_;

    * Initialize the last-seen account manager and the counters
    * for managers and transactions. The counters start at zero
    * and LASTMGR starts blank because the loop below also visits
    * the current transaction, so it is counted there exactly once.
    * Save the beginning of the 90-day period to avoid 
    * recalculating it each time.;
    lastmgr=' ';
    mgrct=0;
    tranct=0;
    ninetyday=datepart(transaction_time)-90;

    * Set the starting row to search for each transaction;
    p=pstart;

    * Loop through all rows for the customer and only count
    * those that occur no later than the current transaction and
    * inside the 90-day period before it.;
    * Note that the transactions are not necessarily sorted
    * in chronological order but rather in groups by customer
    * and manager, so we have to look through all transactions
    * for the customer each time. Because the rows of each manager
    * are adjacent, a change of manager between counted rows
    * marks a new distinct manager.;
    * DO UNTIL(0) means loop forever, so be careful that
    * there is always a LEAVE statement executed.;
    do until(0);

        * p > pmax means the end of the transaction list, so stop.;
        if p > pmax
         then leave;

        set transaction (keep=customerid accountmanager transaction_time
                  rename=(customerid=cust2 accountmanager=mgr2 transaction_time=tt2))
            point=p;

        * When customer ID changes, we are done with the loop.;
        if cust2 ~= customerid
         then leave;
         else do;
            * To be counted, the transaction needs to be within the 
            * 90-day period. Using "<=" for the transaction time
            * includes the current transaction itself.;
            if datepart(tt2) >= ninetyday and tt2 <= transaction_time
             then do;
                tranct=tranct+1;
                if mgr2 ~= lastmgr
                 then do;
                    mgrct=mgrct+1;
                    lastmgr=mgr2;
                 end;
             end;
          end;

        * Look at the next transaction.;
        p=p+1;

    end;

    keep CustomerID AccountManager TransactionID Transaction_Time tranct mgrct;

RUN;

[Edit] Here is a PROC SQL approach that works. It was written by Tom in answer to my question about how to create an elegant query for this task:

proc sql noprint ;
 create table want as
   select a.*
        , count(distinct b.accountmanager) as mgrct
        , count(*) as tranct
   from transaction a
   left join transaction b
   on a.customerid = b.customerid
    and b.transaction_time <= a.transaction_time
    and datepart(a.transaction_time)-datepart(b.transaction_time)
        between 0 and 90
   group by 1,2,3,4
 ;
quit;
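
To compare the output with the table in the question, the result can be sorted chronologically within each customer and printed. A minimal sketch, assuming the `want` dataset and the `tranct` and `mgrct` columns created by the query above:

```sas
/* Put each customer's rows in time order, as in the expected table. */
proc sort data=want;
    by customerid transaction_time;
run;

proc print data=want noobs;
    var customerid accountmanager transactionid transaction_time tranct mgrct;
run;
```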