Finding the number of missed times between two time variables

Time: 2016-06-22 01:49:42

Tags: r dplyr

I have two time variables, time1 and time2:

time1           time2           time difference (min)   number of values missed
6/15/16 8:00    6/15/16 7:58    2                       2
6/16/16 8:00    6/16/16 8:03    3                       0
6/16/16 9:00    6/16/16 9:01    1                       0
6/17/16 8:00    6/17/16 8:00    0                       0
6/18/16 8:00    6/18/16 8:02    2                       2
6/19/16 8:00    6/19/16 8:00    0                       0
6/20/16 8:00    6/20/16 8:00    0                       1

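I want to find the closest times between the two time variables, within some tolerance, for example 4 minutes. If the difference between two times is less than or equal to 4 minutes, I want to treat them as a match; otherwise I want to count how many values in between have no corresponding match. My expected output is the table shown above,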

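where time1 and time2 are the matched values, time difference is the difference between the two, and number of values missed is the number of unmatched values between the current row and the next row that has a match.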

I am finding it hard to put this into code. Can anyone help me get started on this, or suggest any way to solve it?

Thanks

1 answer:

Answer 0: (score: 2)

Your data:

times1 <- structure(c(1466002800, 1466006400, 1466010000, 1466089200, 1466092800, 
1466175600, 1466262000, 1466263800, 1466266200, 1466348400, 1466434800, 
1466445600), class = c("POSIXct", "POSIXt"), tzone = "")

times2 <- structure(c(1466002680, 1466089380, 1466092860, 1466175600, 1466262120, 
1466348400, 1466434800), class = c("POSIXct", "POSIXt"), tzone = "")

Does this do what you want?

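library(dplyr)

# Build every (time1, time2) pair and compute the absolute gap in minutes
expand.grid(time1 = times1, time2 = times2) %>%
  mutate(
    diff = abs(difftime(time1, time2, units = "min"))
    ) %>%
  # Keep only pairs that are at most 4 minutes apart
  filter(diff <= 4) %>%
  arrange(time1) %>%
  mutate(
    # Position of each matched time1 within the full times1 vector
    missed = match(time1, times1),
    # Unmatched times1 entries before the next match; the last row
    # counts the entries left after the final match
    missed = c(diff(missed) - 1,
               length(times1) - tail(missed, n=1))
  )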
#                 time1               time2   diff missed
# 1 2016-06-15 08:00:00 2016-06-15 07:58:00 2 mins      2
# 2 2016-06-16 08:00:00 2016-06-16 08:03:00 3 mins      0
# 3 2016-06-16 09:00:00 2016-06-16 09:01:00 1 mins      0
# 4 2016-06-17 08:00:00 2016-06-17 08:00:00 0 mins      0
# 5 2016-06-18 08:00:00 2016-06-18 08:02:00 2 mins      2
# 6 2016-06-19 08:00:00 2016-06-19 08:00:00 0 mins      0
# 7 2016-06-20 08:00:00 2016-06-20 08:00:00 0 mins      1
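
The missed column is the least obvious step, so here is a minimal sketch of just that part in isolation. It assumes the matched rows sit at positions 1, 4, 5, 6, 7, 10 and 11 of times1, which is what match(time1, times1) gives for the data above; the names matched_pos and n_candidates are made up for this illustration.

```r
# Positions of the matched times within times1 (1-based), i.e. what
# match(time1, times1) returns for the seven rows that survive the filter.
# matched_pos and n_candidates are illustrative names, not from the answer.
matched_pos <- c(1L, 4L, 5L, 6L, 7L, 10L, 11L)
n_candidates <- 12L  # length(times1)

# The gap to the next matched position, minus one, is the number of
# times1 entries skipped in between. The last row has no next match,
# so it counts whatever is left after the final matched position.
missed <- c(diff(matched_pos) - 1L,
            n_candidates - tail(matched_pos, n = 1))

print(missed)
# [1] 2 0 0 0 2 0 1
```

Note that expand.grid builds all length(times1) * length(times2) pairs, which is fine for short vectors like these but grows quickly. Also, if one time1 value had two time2 values within 4 minutes, it would appear in two rows, so this approach seems to rely on each time1 having at most one match.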