Question

I have data like this:

Incident.ID.. = c(rep("INCFI0000029582",4), rep("INCFI0000029587",4))
date = c("2014-09-25 08:39:45", "2014-09-25 08:39:48", "2014-09-25 08:40:44", "2014-10-10 23:04:00", "2014-09-25 08:33:32", "2014-09-25 08:34:41", "2014-09-25 08:35:24", "2014-10-10 23:04:00")
status = c("assigned", "in.progress", "resolved", "closed", "assigned", "resolved", "resolved", "closed")
date.diff = c(3, 56, 1347796, 0, 69, 43, 1348116, 0)
df = data.frame(Incident.ID..,date, status, date.diff, stringsAsFactors = FALSE)

df
    Incident.ID..                date      status date.diff
1 INCFI0000029582 2014-09-25 08:39:45    assigned         3
2 INCFI0000029582 2014-09-25 08:39:48 in.progress        56
3 INCFI0000029582 2014-09-25 08:40:44    resolved   1347796
4 INCFI0000029582 2014-10-10 23:04:00      closed         0
5 INCFI0000029587 2014-09-25 08:33:32    assigned        69
6 INCFI0000029587 2014-09-25 08:34:41    resolved        43
7 INCFI0000029587 2014-09-25 08:35:24    resolved   1348116
8 INCFI0000029587 2014-10-10 23:04:00      closed         0

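I want to select only the rows where the status is "resolved" for a given Incident.ID.. and the next row for the same Incident.ID.. does not have the status "closed". Some incidents may have only a "resolved" row or only a "closed" row, which is why Incident.ID.. must be the same when making the comparison.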

For example, in this sample data only this row would be selected:

6 INCFI0000029587 2014-09-25 08:34:41    resolved        43

How can I do this?

Answer 1

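Here is a simple approach using dplyr: group the data by incident ID, then filter (select rows) using the `lead` function to look at the next row within each group.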

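A minimal sketch of the dplyr approach described above, not necessarily the answerer's original code. It assumes the rows are already ordered by date within each incident. The `default = ""` argument is an assumption added here so that a "resolved" row with no following row in its group is kept; drop it if such rows should be excluded.

```r
library(dplyr)

# Sample data from the question
df <- data.frame(
  Incident.ID.. = c(rep("INCFI0000029582", 4), rep("INCFI0000029587", 4)),
  date = c("2014-09-25 08:39:45", "2014-09-25 08:39:48",
           "2014-09-25 08:40:44", "2014-10-10 23:04:00",
           "2014-09-25 08:33:32", "2014-09-25 08:34:41",
           "2014-09-25 08:35:24", "2014-10-10 23:04:00"),
  status = c("assigned", "in.progress", "resolved", "closed",
             "assigned", "resolved", "resolved", "closed"),
  date.diff = c(3, 56, 1347796, 0, 69, 43, 1348116, 0),
  stringsAsFactors = FALSE
)

result <- df %>%
  group_by(Incident.ID..) %>%
  # lead() is evaluated per group, so the "next row" never crosses incidents
  filter(status == "resolved",
         lead(status, default = "") != "closed") %>%
  ungroup()

print(result)
```

Because `lead()` runs inside `group_by()`, the last row of one incident is never compared with the first row of the next one.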

Answer 2

library(data.table) # shift() was in the development version at the time; it is available from data.table 1.9.6
setDT(df)[, .SD[status == "resolved" & shift(status, type = "lead") != "closed"], by = Incident.ID..]
     Incident.ID..                date   status date.diff
1: INCFI0000029587 2014-09-25 08:34:41 resolved        43

P.S. Updated based on @David's comment.
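The same per-group look-ahead can also be sketched in base R, without extra packages. This is an illustrative alternative rather than part of the original answer; the helper column `next_status` is a name introduced here, and the sketch assumes rows are ordered by date within each incident. Unlike the data.table call above, it keeps a "resolved" row that is the last row of its incident.

```r
# Sample data from the question
df <- data.frame(
  Incident.ID.. = c(rep("INCFI0000029582", 4), rep("INCFI0000029587", 4)),
  date = c("2014-09-25 08:39:45", "2014-09-25 08:39:48",
           "2014-09-25 08:40:44", "2014-10-10 23:04:00",
           "2014-09-25 08:33:32", "2014-09-25 08:34:41",
           "2014-09-25 08:35:24", "2014-10-10 23:04:00"),
  status = c("assigned", "in.progress", "resolved", "closed",
             "assigned", "resolved", "resolved", "closed"),
  date.diff = c(3, 56, 1347796, 0, 69, 43, 1348116, 0),
  stringsAsFactors = FALSE
)

# For each incident, shift the status vector up by one position;
# the last row of every incident has no successor and gets NA
next_status <- ave(df$status, df$Incident.ID..,
                   FUN = function(s) c(s[-1], NA_character_))

# Keep "resolved" rows whose successor within the same incident is not "closed";
# %in% treats NA as "not closed", so a trailing "resolved" row is kept
base_result <- df[df$status == "resolved" & !(next_status %in% "closed"), ]

print(base_result)
```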

R: compare values with the previous row

2 answers: