R: compare against the value in the previous row

Time: 2015-05-22 11:44:20

Tags: r dataframe

I have data like this:

Incident.ID.. = c(rep("INCFI0000029582",4), rep("INCFI0000029587",4))
date = c("2014-09-25 08:39:45", "2014-09-25 08:39:48", "2014-09-25 08:40:44", "2014-10-10 23:04:00", "2014-09-25 08:33:32", "2014-09-25 08:34:41", "2014-09-25 08:35:24", "2014-10-10 23:04:00")
status = c("assigned", "in.progress", "resolved", "closed", "assigned", "resolved", "resolved", "closed")
date.diff = c(3, 56, 1347796, 0, 69, 43, 1348116, 0)
df = data.frame(Incident.ID..,date, status, date.diff, stringsAsFactors = FALSE)

df
    Incident.ID..                date      status date.diff
1 INCFI0000029582 2014-09-25 08:39:45    assigned         3
2 INCFI0000029582 2014-09-25 08:39:48 in.progress        56
3 INCFI0000029582 2014-09-25 08:40:44    resolved   1347796
4 INCFI0000029582 2014-10-10 23:04:00      closed         0
5 INCFI0000029587 2014-09-25 08:33:32    assigned        69
6 INCFI0000029587 2014-09-25 08:34:41    resolved        43
7 INCFI0000029587 2014-09-25 08:35:24    resolved   1348116
8 INCFI0000029587 2014-10-10 23:04:00      closed         0

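I want to select only the rows where the status is "resolved" and the next row for the same Incident.ID.. is not "closed". An incident may have only a "resolved" row or only a "closed" row, which is why Incident.ID.. must be the same when making the comparison.

For example, in this sample data only this row would be selected: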

6 INCFI0000029587 2014-09-25 08:34:41    resolved        43 

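How can I do this?

2 Answers:

Answer 0: (score: 3)

Here is a simple approach using dplyr: group the data by incident ID, then filter (select rows) using the "lead" function to look at the next row.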

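A minimal sketch of the grouped "lead" filter described above, rebuilt from the description rather than taken from the answerer's original code. It assumes the rows are already in chronological order within each incident, and the choice of default = "" (so that a "resolved" row with no following row is kept) is an assumption about what the asker wants:

```r
library(dplyr)

# Sample data from the question
df <- data.frame(
  Incident.ID.. = c(rep("INCFI0000029582", 4), rep("INCFI0000029587", 4)),
  date = c("2014-09-25 08:39:45", "2014-09-25 08:39:48",
           "2014-09-25 08:40:44", "2014-10-10 23:04:00",
           "2014-09-25 08:33:32", "2014-09-25 08:34:41",
           "2014-09-25 08:35:24", "2014-10-10 23:04:00"),
  status = c("assigned", "in.progress", "resolved", "closed",
             "assigned", "resolved", "resolved", "closed"),
  date.diff = c(3, 56, 1347796, 0, 69, 43, 1348116, 0),
  stringsAsFactors = FALSE
)

result <- df %>%
  group_by(Incident.ID..) %>%
  # lead(status) is the status of the next row within the same incident.
  # default = "" (assumption) stands in for "no next row", so a trailing
  # "resolved" row is kept instead of being dropped as NA.
  filter(status == "resolved", lead(status, default = "") != "closed") %>%
  ungroup()

result
```

On the sample data this should leave only the 2014-09-25 08:34:41 "resolved" row of INCFI0000029587. Without the default argument, lead() returns NA for the last row of each group and filter() drops that row, which may or may not be the desired behaviour.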

Answer 1: (score: 3)

library(data.table) #using the development version of data.table
setDT(df)[, .SD[status == "resolved" & shift(status, type = "lead") != "closed"], by = Incident.ID..]
     Incident.ID..                date   status date.diff
1: INCFI0000029587 2014-09-25 08:34:41 resolved        43

P.S. Updated based on @David's comment.