Below is an example of the table I am working with.
df = data.frame(Test_ID = c('a1','a1','a1','a1','a1','a1','a1','a2','a2','a2','a2','a2','a2'),
Event_ID = c('Failure_x', 'Failure_x', 'Failure_y', 'Failure_y', 'Failure_x',
'Failure_x', 'Failure_y', 'Failure_x', 'Failure_y', 'Failure_y',
'Failure_x','Failure_x', 'Failure_y'),
Fail_Date = c('2018-10-10 17:52:20', '2018-10-11 17:02:16', '2018-10-14 12:52:20',
'2018-11-11 16:18:34', '2018-11-12 17:03:06', '2018-11-25 10:50:10',
'2018-12-01 10:28:50', '2018-09-12 19:02:08', '2018-09-20 11:32:25',
'2018-10-13 14:43:30', '2018-10-15 14:22:28', '2018-10-30 21:55:45',
'2018-11-17 11:53:35'))
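Within each Test_ID, I want to subtract fail dates only where a Failure_y appears after a Failure_x, that is, take the difference between the Fail_Date of Event_ID Failure_x and the Fail_Date of Event_ID Failure_y. A group can contain several Failure_y events. The second Failure_y should be paired with the Failure_x that occurs after the first Failure_y.
I tried to create a TIME_BETWEEN_FAILURES column with dplyr.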
library(dplyr)      # provides %>%, group_by(), mutate(), lag(), first()
library(lubridate)
# Parse the timestamps as date-times
df$Fail_Date = as.POSIXct(as.character(as.factor(df$Fail_Date)),format="%Y-%m-%d %H:%M:%S")
df = df %>% group_by(Test_ID) %>%
  mutate(TIME_BETWEEN_FAILURES = ifelse(Event_ID == "Failure_y" & lag(Event_ID) == "Failure_x",
                                        difftime(Fail_Date, first(Fail_Date),units = "hours"),''))
With first() from dplyr I can only get TIME_BETWEEN_FAILURES right for the first instance in each group. That is where I am stuck. Any help would be greatly appreciated.
(Image: result from the code snippet above.)
(Image: ideal output needed for my analysis.)
Thanks. Cheers.
Answer 0 (score: 0)
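df %>%
  # gr counts Failure_y events from the bottom up, so every run of rows ends at a Failure_y
  group_by(gr = rev(cumsum(rev(Event_ID)=="Failure_y")), Test_ID) %>%
  # for runs with more than one row, hours from the run's first row to its last row
  mutate(time_between_failures = ifelse(n() > 1 & Event_ID=="Failure_y", difftime(Fail_Date[n()], Fail_Date[1L], units = "hours"), NA))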
# A tibble: 13 x 5
# Groups: gr, Test_ID [6]
   Test_ID Event_ID  Fail_Date              gr time_between_failures
   <fct>   <fct>     <dttm>              <int>                 <dbl>
 1 a1      Failure_x 2018-10-10 17:52:20     6                   NA
 2 a1      Failure_x 2018-10-11 17:02:16     6                   NA
 3 a1      Failure_y 2018-10-14 12:52:20     6                   91
 4 a1      Failure_y 2018-11-11 16:18:34     5                   NA
 5 a1      Failure_x 2018-11-12 17:03:06     4                   NA
 6 a1      Failure_x 2018-11-25 10:50:10     4                   NA
 7 a1      Failure_y 2018-12-01 10:28:50     4                  449.
 8 a2      Failure_x 2018-09-12 19:02:08     3                   NA
 9 a2      Failure_y 2018-09-20 11:32:25     3                  185.
10 a2      Failure_y 2018-10-13 14:43:30     2                   NA
11 a2      Failure_x 2018-10-15 14:22:28     1                   NA
12 a2      Failure_x 2018-10-30 21:55:45     1                   NA
13 a2      Failure_y 2018-11-17 11:53:35     1                  790.
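The gr key is easier to follow when looked at on its own. The following is a minimal base R sketch, assuming a toy vector `events` (a hypothetical name, not part of the original data) that copies the Event_ID sequence of Test_ID a1. It only illustrates how the reversed cumulative sum labels the runs; the numbers differ from the gr column above because the real data also contains the a2 rows.

```r
# Toy event sequence (hypothetical; same order as the a1 rows above)
events <- c("Failure_x", "Failure_x", "Failure_y", "Failure_y",
            "Failure_x", "Failure_x", "Failure_y")

# Reverse, count Failure_y cumulatively, reverse back:
# each row gets the number of Failure_y events at or after it,
# so a new label starts right after every Failure_y.
gr <- rev(cumsum(rev(events) == "Failure_y"))

# Rows sharing a label form one run that ends at a Failure_y
runs <- split(events, gr)
print(gr)
print(runs)
```

A run of length one, such as the second Failure_y here, has no preceding Failure_x, which appears to be why the answer guards with n() > 1.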