I have a data.table such as:
ID Time Event
1: 1 2016-09-25 14:47:52 1
2: 1 2016-10-03 19:35:04 1
3: 1 2016-10-03 21:11:00 -1
4: 1 2016-10-04 14:25:56 1
5: 1 2016-11-05 01:40:13 1
6: 1 2016-11-27 04:40:21 1
7: 1 2016-12-04 02:36:37 1
8: 1 2017-01-12 13:48:01 1
9: 1 2017-01-15 03:32:35 1
10: 1 2017-02-05 01:35:07 1
11: 1 2017-02-05 02:29:31 1
12: 1 2017-02-05 02:34:33 1
13: 2 2016-07-15 08:14:11 1
14: 2 2016-07-22 22:15:44 1
15: 2 2016-07-23 12:00:00 -1
16: 2 2016-11-30 18:21:51 1
17: 2 2016-12-03 07:00:31 1
18: 2 2016-12-06 06:30:34 1
19: 2 2016-12-16 10:00:50 1
20: 2 2017-01-16 08:33:16 1
I am trying to check, grouped by ID, whether a positive event occurs after a negative event. My ideal output is a data.table with:
ID Outcome
1 TRUE
2 TRUE
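I don't know how to formulate a filter condition that takes both the Time column and the Event column into account. For a given ID, I want to know whether there is a row with Event = 1 whose Time is later than the Time of the row with Event = -1, but I can't express this in code. Can anyone help?
I am attaching a demo data set here: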
fakedata <- structure(list(ID = c(1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 2L,
2L, 2L, 2L, 2L, 2L,
2L, 2L), Time = c("2016-09-25 14:47:52", "2016-10-03 19:35:04",
"2016-10-03 21:11:00", "2016-10-04 14:25:56", "2016-11-05 01:40:13",
"2016-11-27 04:40:21", "2016-12-04 02:36:37", "2017-01-12 13:48:01",
"2017-01-15 03:32:35", "2017-02-05 01:35:07", "2017-02-05 02:29:31",
"2017-02-05 02:34:33", "2016-07-15 08:14:11", "2016-07-22 22:15:44",
"2016-07-23 12:00:00", "2016-11-30 18:21:51", "2016-12-03 07:00:31",
"2016-12-06 06:30:34", "2016-12-16 10:00:50", "2017-01-16 08:33:16"
), Event = c(1, 1, -1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, -1, 1,
1, 1, 1, 1)), .Names = c("ID", "Time", "Event"), class = c("data.table",
"data.frame"), row.names = c(NA, -20L))
Answer 0 (score: 1)
Here is a data.table approach that uses the base R functions any and which together with the && operator.
fakedata[order(ID, as.POSIXct(Time)),
.(outcome=any(Event == -1) && Event[which(Event == -1)+1] > 0), by=ID]
ID outcome
1: 1 TRUE
2: 2 TRUE
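As david-arenburg mentioned in the comments, it is a good idea to make sure the data set is correctly ordered before doing the computation. With data.table, we can do this in the i argument. Following david-arenburg's comment, I ordered first by ID and then by as.POSIXct(Time).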
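To illustrate the ordering step on its own, here is a minimal sketch. The toy table dt and the tz = "UTC" argument are assumptions made for this example only; the question does not state a time zone.

```r
library(data.table)

# Toy data, deliberately out of time order (not the question's data set).
dt <- data.table(
  ID    = c(1L, 1L, 1L),
  Time  = c("2016-10-04 14:25:56", "2016-09-25 14:47:52", "2016-10-03 21:11:00"),
  Event = c(1, 1, -1)
)

# order() in the i argument returns a sorted copy; dt itself is unchanged.
# tz = "UTC" is an assumption, used so the parsing does not depend on the locale.
sorted <- dt[order(ID, as.POSIXct(Time, tz = "UTC"))]

sorted$Event
```

Because the sort happens in i, any j expression evaluated in the same call sees the rows of each ID in chronological order.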
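In the j argument, .(outcome=any(Event == -1) && Event[which(Event == -1)+1] > 0), the call any(Event == -1) checks whether a -1 is present at all. If it is, Event[which(Event == -1)+1] > 0 checks whether the event immediately following the -1 is positive. If the first check fails, && short-circuits and FALSE is returned.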
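The expression above assumes exactly one -1 per ID that is not the last row of its group. If an ID has several -1 rows, the right-hand side of && has length greater than one, which is an error from R 4.3.0 onward; if the -1 is the last row, the result is NA. A possible variant that tolerates both cases and tests for any later positive event, not only the next one, could look like the sketch below. The helper name positive_after_negative, the toy data and tz = "UTC" are assumptions for illustration, not part of the original answer.

```r
library(data.table)

# Hypothetical helper: TRUE if any positive event occurs after the first
# negative event in an already time-ordered vector of events.
positive_after_negative <- function(event) {
  neg <- which(event < 0)
  if (length(neg) == 0L) return(FALSE)            # no negative event at all
  first_neg <- neg[1L]
  if (first_neg == length(event)) return(FALSE)   # negative event is the last row
  any(event[(first_neg + 1L):length(event)] > 0)
}

# Toy data (not the question's data set) covering three cases.
toy <- data.table(
  ID    = c(1L, 1L, 1L, 2L, 2L, 3L, 3L, 3L),
  Time  = c("2016-01-01 10:00:00", "2016-01-02 10:00:00", "2016-01-03 10:00:00",
            "2016-02-01 10:00:00", "2016-02-02 10:00:00",
            "2016-03-01 10:00:00", "2016-03-02 10:00:00", "2016-03-03 10:00:00"),
  Event = c(1, -1, 1,   1, -1,   -1, -1, 1)
)

# Same pattern as the answer: order in i, compute one value per ID in j.
res <- toy[order(ID, as.POSIXct(Time, tz = "UTC")),
           .(outcome = positive_after_negative(Event)), by = ID]
res
```

Here ID 1 has a positive event after its -1, ID 2 ends on a -1, and ID 3 has two -1 rows followed by a positive event. The helper works on row position within the ordered group, so rows with identical timestamps are compared in whatever order the sort leaves them.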