Time difference between rows in dplyr

Time: 2017-08-15 14:06:47

Tags: r dplyr lubridate

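I have a data frame in the format below, and I am trying to find the time difference between each 'ASSIGNED' event and the last 'CREATED' event that came before it.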

**AccountID**              **TIME**                    **EVENT**
1                      2016-11-08T01:54:15.000Z        CREATED
1                      2016-11-09T01:54:15.000Z        ASSIGNED
1                      2016-11-10T01:54:15.000Z        CREATED
1                      2016-11-11T01:54:15.000Z        CALLED
1                      2016-11-12T01:54:15.000Z        ASSIGNED
1                      2016-11-12T01:54:15.000Z        SLEEP

My current code is below. The part I am struggling with is selecting the CREATED event that precedes each ASSIGNED event.

test <- timetable.filter %>%
  group_by(AccountID) %>%
  mutate(timeToAssign = ifelse(EVENT == 'ASSIGNED', 
                                interval(ymd_hms(TIME), max(ymd_hms(TIME[EVENT == 'CREATED']))) %/% hours(1), NA))

The output I am looking for is:

**AccountID**              **TIME**                    **EVENT**        **timeToAssign**
1                      2016-11-08T01:54:15.000Z        CREATED         NA
1                      2016-11-09T01:54:15.000Z        ASSIGNED         12
1                      2016-11-10T01:54:15.000Z        CREATED         NA
1                      2016-11-11T01:54:15.000Z        CALLED         NA
1                      2016-11-12T01:54:15.000Z        ASSIGNED         24
1                      2016-11-12T01:54:15.000Z        SLEEP         NA

1 answer:

Answer 0: (score: 5)

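One way to do this with dplyr and tidyr: record the row number of each CREATED event, fill it downward so that every row knows the most recent CREATED row before it, and then subtract the two times on the ASSIGNED rows.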

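library(dplyr); library(tidyr); library(anytime)

df %>% 
    group_by(AccountID) %>% 
    # Mark each CREATED row with its row number within the group; parse TIME as a date-time
    mutate(CREATED_INDEX = if_else(EVENT == 'CREATED', row_number(), NA_integer_),
           TIME = anytime(TIME)) %>% 
    # Carry the most recent CREATED row number down to the rows that follow it
    fill(CREATED_INDEX) %>% 
    # On ASSIGNED rows, take the difference in hours from that CREATED row
    mutate(TimeToAssign = if_else(EVENT == 'ASSIGNED', 
                                  as.numeric(TIME - TIME[CREATED_INDEX], units = 'hours'), 
                                  NA_real_)) %>% 
    select(-CREATED_INDEX)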

# A tibble: 6 x 4
# Groups:   AccountID [1]
#  AccountID                TIME    EVENT TimeToAssign
#      <int>              <dttm>   <fctr>        <dbl>
#1         1 2016-11-08 01:54:15  CREATED           NA
#2         1 2016-11-09 01:54:15 ASSIGNED           24
#3         1 2016-11-10 01:54:15  CREATED           NA
#4         1 2016-11-11 01:54:15   CALLED           NA
#5         1 2016-11-12 01:54:15 ASSIGNED           48
#6         1 2016-11-12 01:54:15    SLEEP           NA
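
For anyone who wants to reproduce this, here is a minimal self-contained sketch of the same fill-down idea. It is a variant, not the code above: it carries the CREATED timestamp itself forward instead of a row index, and parses the times with lubridate's `ymd_hms()` rather than `anytime()`. The `df` built here is reconstructed from the sample in the question, `LAST_CREATED` and `result` are names made up for illustration, and the rows are assumed to already be in time order within each account.

```r
library(dplyr)
library(tidyr)
library(lubridate)

# Sample data reconstructed from the question (assumed to be sorted by time per account)
df <- tibble(
  AccountID = 1L,
  TIME = c("2016-11-08T01:54:15.000Z", "2016-11-09T01:54:15.000Z",
           "2016-11-10T01:54:15.000Z", "2016-11-11T01:54:15.000Z",
           "2016-11-12T01:54:15.000Z", "2016-11-12T01:54:15.000Z"),
  EVENT = c("CREATED", "ASSIGNED", "CREATED", "CALLED", "ASSIGNED", "SLEEP")
)

result <- df %>%
  mutate(TIME = ymd_hms(TIME)) %>%               # parse the ISO 8601 strings as UTC date-times
  group_by(AccountID) %>%
  # Keep the timestamp only on CREATED rows (hypothetical helper column)
  mutate(LAST_CREATED = replace(TIME, EVENT != "CREATED", NA)) %>%
  # Carry the latest CREATED timestamp down to the rows after it
  fill(LAST_CREATED) %>%
  # Hours between each ASSIGNED row and the CREATED row before it
  mutate(TimeToAssign = if_else(EVENT == "ASSIGNED",
                                as.numeric(difftime(TIME, LAST_CREATED, units = "hours")),
                                NA_real_)) %>%
  ungroup() %>%
  select(-LAST_CREATED)

print(result)
```

On the question's sample data this should give 24 and 48 hours for the two ASSIGNED rows, the same as the output above, since each ASSIGNED event falls one or two days after the nearest preceding CREATED event. An ASSIGNED row with no earlier CREATED row in its group would get NA.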