Count occurrences of 15-minute time slots between two date-times

Time: 2019-03-29 14:27:00

Tags: r datetime

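I have a table of date-times and would like to build a new data frame that, for each row, covers the hours from 00:00 to 00:00 in 15-minute steps, e.g. 00:00, 00:15, ... 23:45. The goal is to find how many times a given worker is working at each of those times according to their schedule.

Note that the date-time format is d-m-Y h:m.

My data (altered):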

worker           Start_shift        End_shift          difference        
Worker  130    30-05-2018 15:00   01-06-2018 08:15   41.25     
Worker  130    15-06-2018 15:00   16-06-2018 09:00   18.00     
Worker  130    22-03-2018 15:00   23-03-2018 08:15   17.25     
Worker  130    27-02-2018 15:00   28-02-2018 10:00   19.00     
Worker  130    30-05-2018 15:00   01-06-2018 08:15   41.25     
Worker  18    27-04-2018 15:00   29-04-2018 07:24   40.40     
Worker  11    29-03-2018 16:00   31-03-2018 07:24   39.40     
Worker  11    25-03-2018 16:00   27-03-2018 07:24   39.40     

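I would like the output to be a new data frame in which I can see how many times each worker is working at the different timestamps.

This is only an example of the desired output, not the real output for the dataset above. The counts below may be wrong.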

            00:00 | 00:15 | 00:30 | ... | 23:45 
worker 130     5      5       6       ..    4
worker 18      2      5       5       ..    3
worker 11      1      1       1       ..    1

I tried to create a 15-minute sequence with a seq() call:

seq15 <- seq(lubridate::as_datetime(paste0(DATE_Start, " 00:00:00"), format="%Y-%m-%d %H:%M:%S", tz = "UTC"), lubridate::as_datetime(paste0(DATE_End, " 00:00:00"), format="%Y-%m-%d %H:%M:%S", tz = "UTC"), by = "15 mins")

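However, because the shifts run across several days, I cannot work out how to add the timestamps together.

Any help would be much appreciated.

The dput output is below: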

structure(list(Start_shift = c("30-05-2018 15:00", "15-06-2018 15:00", 
"22-03-2018 15:00", "27-02-2018 15:00", "30-05-2018 15:00", "27-04-2018 15:00", 
"29-03-2018 16:00", "29-03-2018 16:00"), End_shift = c("01-06-2018 08:15", 
"16-06-2018 09:00", "23-03-2018 08:15", "28-02-2018 10:00", "01-06-2018 08:15", 
"29-04-2018 07:24", "31-03-2018 07:24", "31-03-2018 07:24"), 
    difference = structure(c(41.25, 18, 17.25, 19, 41.25, 40.4, 
    39.4, 39.4), class = "difftime", units = "mins"), worker = structure(c(30L, 
    30L, 30L, 30L, 30L, 8L, 1L, 1L), .Label = c("Worker  11", 
    "Worker  12", "Worker  13", "Worker  14", "Worker  15", "Worker  16", 
    "Worker  17", "Worker  18", "Worker  19", "Worker  110", 
    "Worker  111", "Worker  112", "Worker  113", "Worker  114", 
    "Worker  115", "Worker  116", "Worker  117", "Worker  118", 
    "Worker  119", "Worker  120", "Worker  121", "Worker  122", 
    "Worker  123", "Worker  124", "Worker  125", "Worker  126", 
    "Worker  127", "Worker  128", "Worker  129", "Worker  130", 
    "Worker  131", "Worker  132", "Worker  133", "Worker  134", 
    "Worker  135", "Worker  136", "Worker  137", "Worker  138", 
    "Worker  139", "Worker  140"), class = "factor")), row.names = c(7052L, 
7053L, 7054L, 7055L, 7074L, 1767L, 21L, 58L), class = "data.frame")

1 answer:

Answer 0 (score: 1)

I'm using the data you posted, stored as dt:

library(tidyverse)
library(lubridate)


dt %>%
  mutate(Start_shift = dmy_hm(Start_shift),
         End_shift = dmy_hm(End_shift)) %>%           # convert to date-time
  rowwise() %>%                                       # for each row
  mutate(date_vec = list(seq(Start_shift, 
                             End_shift, 
                             by = "15 mins"))) %>%    # create a vector of date-times 15 mins apart
  ungroup() %>%                                       # forget the grouping
  unnest(date_vec) %>%                                # unnest the vector of date-times
  mutate(time = format(date_vec, "%H:%M")) %>%        # keep only hr-mins
  count(worker, time) %>%                             # count combinations
  spread(time, n)                                     # reshape to one column per time
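
To make the row-wise step easier to follow, here is a minimal base R sketch of what happens for a single shift. It uses one row from the question's data and assumes UTC as the time zone (the question does not state one); the variable names are only for illustration.

```r
# One shift from the question's data, parsed as day-month-year hour:minute
# (UTC is an assumption; the question does not give a time zone)
start <- as.POSIXct("22-03-2018 15:00", format = "%d-%m-%Y %H:%M", tz = "UTC")
end   <- as.POSIXct("23-03-2018 08:15", format = "%d-%m-%Y %H:%M", tz = "UTC")

# Every 15-minute timestamp from start to end, both ends included
stamps <- seq(start, end, by = "15 mins")

# Reduce each timestamp to its time of day
slots <- format(stamps, "%H:%M")

length(slots)    # number of 15-minute marks covered by this shift
head(slots, 3)   # first few slots
tail(slots, 1)   # last slot
```

Each value in slots is one "hit" for that time of day; counting those hits per worker over all rows gives the table.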

There is also a more compact solution, which uses map instead of rowwise and generates the date-time vectors and keeps only hr-mins in a single step:

dt %>%
  mutate(Start_shift = dmy_hm(Start_shift),
         End_shift = dmy_hm(End_shift),
         time = map2(Start_shift, End_shift, ~format(seq(.x, .y, by = "15 mins"), "%H:%M"))) %>%
  unnest(time) %>%
  count(worker, time) %>%
  spread(time, n)
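
Note that spread only creates columns for the times that actually occur, and leaves NA where a worker has no count. If you want all 96 columns from 00:00 to 23:45 with zeros, one possible approach is to turn the time into a factor with a fixed set of levels before counting. The following is only a base R sketch on a small made-up data frame (shifts, with hypothetical columns worker, start and end), not on your real data:

```r
# Hypothetical example data: two short shifts for one worker (UTC assumed)
shifts <- data.frame(
  worker = c("A", "A"),
  start  = as.POSIXct(c("2018-03-22 23:30", "2018-03-23 23:45"), tz = "UTC"),
  end    = as.POSIXct(c("2018-03-23 00:15", "2018-03-24 00:00"), tz = "UTC")
)

# All 96 quarter-hour labels of a day: "00:00", "00:15", ..., "23:45"
all_slots <- format(
  seq(as.POSIXct("2018-01-01 00:00", tz = "UTC"), by = "15 mins", length.out = 96),
  "%H:%M"
)

# Expand every shift into one row per 15-minute timestamp
long <- do.call(rbind, lapply(seq_len(nrow(shifts)), function(i) {
  data.frame(
    worker = shifts$worker[i],
    time   = format(seq(shifts$start[i], shifts$end[i], by = "15 mins"), "%H:%M")
  )
}))

# Fixed factor levels make table() keep slots that never occur (as zeros)
long$time <- factor(long$time, levels = all_slots)
counts <- table(long$worker, long$time)

dim(counts)          # one row per worker, 96 columns
counts["A", "23:45"] # slot covered by both shifts
counts["A", "12:00"] # slot covered by neither shift
```

The same idea should carry over to the tidyverse pipelines above by converting time to a factor with these levels before counting, but I have not tested that on your data.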