Deleting observations before "Begin" and after "End" - SAS code

Time: 2019-01-15 22:51:12

Tags: sas

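I have a table with some leading and trailing observations that I want to delete. For each group, I want to delete the rows before the "Begin" event and the rows after the "End" event. The table looks like this: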

| Time | Group | Event | Value |
|------|-------|-------|-------|
| 1    | 1     | NA    | 0     |
| 2    | 1     | NA    | 0     |
| 3    | 1     | Begin | 1.1   |
| 4    | 1     | NA    | 1.2   |
| 5    | 1     | NA    | 1.3   |
| 6    | 1     | End   | 1.4   |
| 7    | 1     | NA    | 0     |
| 1    | 2     | NA    | 0     |
| 2    | 2     | Begin | 1.1   |
| 3    | 2     | NA    | 1.2   |
| 4    | 2     | End   | 1.3   |
| 5    | 2     | NA    | 1.4   |

3 answers:

Answer 0 (score: 2)

Assuming the incoming data is already sorted, and that each group contains zero or more Begin-to-End ranges:

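
The assumption stated in this answer (data sorted by Group and Time, with zero or more Begin-to-End ranges per group) could be handled with a retained flag in a single DATA step. The following is only a sketch under that assumption, not necessarily the answerer's own code; the flag name `in_range` is hypothetical, and the dataset names `have` and `want` follow the other answers.

```sas
/* Sketch only: assumes HAVE is sorted by Group and Time. */
data want;
    set have;
    by group;
    retain in_range 0;                     /* hypothetical flag: 1 while inside a Begin..End range */
    if first.group then in_range = 0;      /* never carry an open range into the next group */
    if event = 'Begin' then in_range = 1;  /* a range opens on the Begin row */
    if in_range then output;               /* keeps the Begin row, the End row, and rows between */
    if event = 'End' then in_range = 0;    /* close the range after the End row is written */
    drop in_range;
run;
```

For the sample table in the question, this keeps Time 3 to 6 in group 1 and Time 2 to 4 in group 2. Because the flag is reset only at `first.group` and after an End row, a group with several Begin-to-End ranges keeps all of them, and a group with no Begin row contributes nothing.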

Answer 1 (score: 0)

One approach is:

data have;
input  Time  Group  Event $ Value  ;
datalines;
1      1   NA          0 
2      1   NA          0 
3      1   Begin     1.1 
4      1   NA        1.2 
5      1   NA        1.3 
6      1   End       1.4 
7      1   NA          0 
1      2   NA          0 
2      2   Begin     1.1 
3       2  NA        1.2 
4       2  End       1.3 
5       2  NA        1.4 
;


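/* One row per group holding the row numbers of its Begin and End events. */
data have2(keep=Group min_var max_var);
    set have;
    by group;
    retain min_var max_var;
    /* Reset so a group without Begin or End does not inherit the previous group's values. */
    if first.group then call missing(min_var, max_var);
    if Event = "Begin" then min_var = _n_;
    if Event = "End" then max_var = _n_;
    if last.group;
run;

/* HAVE2 has one row per group, so _n_ here still equals the row number in HAVE. */
data want;
    merge have have2;
    by group;
    if _n_ ge min_var and _n_ le max_var;
    drop min_var max_var;
run;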

Answer 2 (score: 0)

Here is a simple approach, although it is limited by the size of the actual data.

data have;
input  Time  Group  Event $ Value  ;
datalines;
1      1   NA          0 
2      1   NA          0 
3      1   Begin     1.1 
4      1   NA        1.2 
5      1   NA        1.3 
6      1   End       1.4 
7      1   NA          0 
1      2   NA          0 
2      2   Begin     1.1 
3       2  NA        1.2 
4       2  End       1.3 
5       2  NA        1.4 
;
run;

proc sort data = have;
     by group time;
run;

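data have1;
    set have;
    by group;
    count + 1;                                /* sum statement: count is retained across rows */
    if first.group then count = -100;         /* rows before Begin stay negative */
    if event = 'Begin' then count = 0;        /* counting restarts at the Begin row */
    if event = 'End' then count = 100;        /* rows after End go above 100 */
    if count < 0 or count > 100 then delete;  /* drop rows outside Begin..End */
run;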

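The code as written works for small data sets: it requires fewer than 100 observations between "Begin" and "End", and fewer than 100 observations before "Begin", in each group. You can adjust the starting and ending count values to match the size of your actual data.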
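
One way to make that adjustment in a single place is to put the bound in a macro variable. This is a minimal sketch of the same counter approach; the macro variable name `limit` and its value are hypothetical and should be chosen larger than the longest stretch of rows before "Begin", or between "Begin" and "End", in any group.

```sas
/* Hypothetical bound; must exceed the longest run of rows before Begin
   and between Begin and End within any group. */
%let limit = 100000;

data have1;
    set have;                                     /* assumes HAVE is sorted by group and time */
    by group;
    count + 1;                                    /* retained running counter */
    if first.group then count = -&limit;          /* rows before Begin stay negative */
    if event = 'Begin' then count = 0;
    if event = 'End' then count = &limit;         /* rows after End exceed the bound */
    if count < 0 or count > &limit then delete;
    drop count;                                   /* helper variable is not needed in the output */
run;
```

The logic is unchanged from the code above; only the hard-coded 100 is replaced, and the helper variable `count` is dropped from the output.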