I am using SAS and have a dataset like this:
Table 1
+------+------------+-----------+
| name | date | time |
+------+------------+-----------+
| A | 7-May-08 | 09:01:41 |
| A | 7-May-08 | 09:01:41 |
| A | 7-May-08 | 09:03:20 |
| A | 7-May-08 | 09:04:41 |
| A | 7-May-08 | 11:32:41 |
| A | 8-May-08 | 09:06:00 |
| A | 8-May-08 | 09:06:01 |
| A | 8-May-08 | 12:32:41 |
| B | 7-May-08 | 09:00:01 |
| B | 7-May-08 | 09:00:01 |
| B | 7-May-08 | 11:33:41 |
| B | 9-May-08 | 09:05:59 |
| B | 9-May-08 | 11:35:41 |
| B | 9-May-08 | 11:36:41 |
| B | 9-May-08 | 11:37:41 |
| B | 12-May-08 | 11:27:41 |
| B | 12-May-08 | 11:27:41 |
+------+------------+-----------+
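Now I want to perform two main operations:
1 - For each name and date, if there are time values between 9:00:01 and 9:05:59, delete the first row in that interval;
2 - Following on from step 1, if the next row has the same time value as that first row, delete every row in the interval that shares that time value.
For example, table1 should end up looking like this: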
+------+-----------+----------+
| name | date | time |
+------+-----------+----------+
| A | 7-May-08 | 09:03:20 |
| A | 7-May-08 | 09:04:41 |
| A | 7-May-08 | 11:32:41 |
| A | 8-May-08 | 09:06:00 |
| A | 8-May-08 | 09:06:01 |
| A | 8-May-08 | 12:32:41 |
| B | 7-May-08 | 11:33:41 |
| B | 9-May-08 | 11:35:41 |
| B | 9-May-08 | 11:36:41 |
| B | 9-May-08 | 11:37:41 |
| B | 12-May-08 | 11:27:41 |
| B | 12-May-08 | 11:27:41 |
+------+-----------+----------+
How can I do this?
Answer 0 (score: 1)
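I assume the data is sorted by time within each name + date group, which is what the condition "if the next row has the same time value as the first row" implies.
The query is then quite simple: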
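data have;
  /* List input with the colon modifier reads each space-delimited field, */
  /* so date and time parse correctly even though their widths vary.      */
  input name :$1. date :date11. time :time8.;
  format date date11.;
  format time time.;
  datalines;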
A 7-May-08 09:01:41
A 7-May-08 09:01:41
A 7-May-08 09:03:20
A 7-May-08 09:04:41
A 7-May-08 11:32:41
A 8-May-08 09:06:00
A 8-May-08 09:06:01
A 8-May-08 12:32:41
B 7-May-08 09:00:01
B 7-May-08 09:00:01
B 7-May-08 11:33:41
B 9-May-08 09:05:59
B 9-May-08 11:35:41
B 9-May-08 11:36:41
B 9-May-08 11:37:41
B 12-May-08 11:27:41
B 12-May-08 11:27:41
;
run;
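proc sql;
  /* Within each name + date group, drop the rows holding the earliest time, */
  /* but only when that earliest time lies between 09:00:01 and 09:05:59.    */
  /* SAS remerges min(time) onto every row and writes a NOTE to the log.     */
  create table want as
  select *
  from have
  group by name, date
  having min(time) not between '09:00:01't and '09:05:59't
      or time ne min(time)
  ;
quit;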
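As an alternative sketch, the same filter could also be written as a DATA step with BY-group processing. Unlike the SQL query, this version really does depend on the rows being sorted by time within each name and date, so it sorts first. The names have_sorted, want_ds, and drop_time are illustrative and not part of the original answer, and like the query it assumes that the row to drop is the earliest row of the group.

```sas
/* Sort so the earliest time comes first within each name + date group. */
proc sort data=have out=have_sorted;
  by name date time;
run;

data want_ds;
  set have_sorted;
  by name date;
  retain drop_time;   /* hypothetical helper: the time value to remove in this group */

  if first.date then do;
    /* Remember the group's earliest time only if it falls inside the window. */
    if '09:00:01't <= time <= '09:05:59't then drop_time = time;
    else drop_time = .;
  end;

  /* Remove the first row and any later rows that repeat its time. */
  if not missing(drop_time) and time = drop_time then delete;

  drop drop_time;
run;
```

With the sample data above, want_ds should contain the same rows as want. The SQL version is shorter, while the DATA step avoids the remerge and returns the rows in sorted order.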