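I have a table of date-times and want to build, for each row (i.e. each shift), a new data frame covering the hours from 00:00 to 00:00 in 15-minute steps: 00:00, 00:15, ..., 23:45. The goal is to find out how many times a given worker is working at each time of day in their schedule.
Note that the date-time format is d-m-Y h:m.
My data (altered):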
worker Start_shift End_shift difference
Worker 130 30-05-2018 15:00 01-06-2018 08:15 41.25
Worker 130 15-06-2018 15:00 16-06-2018 09:00 18.00
Worker 130 22-03-2018 15:00 23-03-2018 08:15 17.25
Worker 130 27-02-2018 15:00 28-02-2018 10:00 19.00
Worker 130 30-05-2018 15:00 01-06-2018 08:15 41.25
Worker 18 27-04-2018 15:00 29-04-2018 07:24 40.40
Worker 11 29-03-2018 16:00 31-03-2018 07:24 39.40
Worker 11 25-03-2018 16:00 27-03-2018 07:24 39.40
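I would like the output as a new data frame that shows how often each worker is working at the different timestamps.
This is only an example of the desired output, not the real output for the dataset above. The counts below may be wrong.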
00:00 | 00:15 | 00:30 | ... | 23:45
worker 130 5 5 6 .. 4
worker 18 2 5 5 .. 3
worker 11 1 1 1 .. 1
I tried creating a 15-minute sequence with a seq() call.
seq15 <- seq(lubridate::as_datetime(paste0(DATE_Start, " 00:00:00"), format="%Y-%m-%d %H:%M:%S", tz = "UTC"),
lubridate::as_datetime(paste0(DATE_End, " 00:00:00"), format="%Y-%m-%d %H:%M:%S", tz = "UTC"), by = "15 mins")
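However, because the shifts run over several days, I can't add the timestamps together.
Any help would be greatly appreciated.
The dput is below: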
structure(list(Start_shift = c("30-05-2018 15:00", "15-06-2018 15:00",
"22-03-2018 15:00", "27-02-2018 15:00", "30-05-2018 15:00", "27-04-2018 15:00",
"29-03-2018 16:00", "29-03-2018 16:00"), End_shift = c("01-06-2018 08:15",
"16-06-2018 09:00", "23-03-2018 08:15", "28-02-2018 10:00", "01-06-2018 08:15",
"29-04-2018 07:24", "31-03-2018 07:24", "31-03-2018 07:24"),
difference = structure(c(41.25, 18, 17.25, 19, 41.25, 40.4,
39.4, 39.4), class = "difftime", units = "mins"), worker = structure(c(30L,
30L, 30L, 30L, 30L, 8L, 1L, 1L), .Label = c("Worker 11",
"Worker 12", "Worker 13", "Worker 14", "Worker 15", "Worker 16",
"Worker 17", "Worker 18", "Worker 19", "Worker 110",
"Worker 111", "Worker 112", "Worker 113", "Worker 114",
"Worker 115", "Worker 116", "Worker 117", "Worker 118",
"Worker 119", "Worker 120", "Worker 121", "Worker 122",
"Worker 123", "Worker 124", "Worker 125", "Worker 126",
"Worker 127", "Worker 128", "Worker 129", "Worker 130",
"Worker 131", "Worker 132", "Worker 133", "Worker 134",
"Worker 135", "Worker 136", "Worker 137", "Worker 138",
"Worker 139", "Worker 140"), class = "factor")), row.names = c(7052L,
7053L, 7054L, 7055L, 7074L, 1767L, 21L, 58L), class = "data.frame")
Answer 0 (score: 1)
I'm using the data you posted as dt:
library(tidyverse)
library(lubridate)
dt %>%
  mutate(Start_shift = dmy_hm(Start_shift),
         End_shift = dmy_hm(End_shift)) %>%          # update to datetime
  rowwise() %>%                                       # for each row
  mutate(date_vec = list(seq(Start_shift,
                             End_shift,
                             by = "15 mins"))) %>%    # create a vector of 15 min distance date-times
  ungroup() %>%                                       # forget the grouping
  unnest(date_vec) %>%                                # unnest vector of date-times
  mutate(time = format(date_vec, "%H:%M")) %>%        # keep only hr-mins (also safe at midnight)
  count(worker, time) %>%                             # count combinations
  spread(time, n)                                     # reshape
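There is also a more compact solution that uses map2 (from purrr's map family) instead of rowwise, generating the date-time vectors and keeping only the hr-mins in a single step: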
dt %>%
  mutate(Start_shift = dmy_hm(Start_shift),
         End_shift = dmy_hm(End_shift),
         time = map2(Start_shift, End_shift, ~format(seq(.x, .y, by = "15 mins"), "%H:%M"))) %>%
  unnest(time) %>%
  count(worker, time) %>%
  spread(time, n)
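If you would rather have zeros than NA for the slots a worker never covers, the same pipeline can probably be finished with pivot_wider. This is only a minimal sketch: the two-row `shifts` data frame is a made-up sample in the question's d-m-Y h:m format (not the posted data), and `names_sort` / scalar `values_fill` assume a reasonably recent tidyr.

```r
library(dplyr)
library(tidyr)
library(purrr)
library(lubridate)

# Hypothetical sample, NOT the data from the question: two short shifts crossing midnight
shifts <- data.frame(
  worker      = c("Worker 130", "Worker 11"),
  Start_shift = c("27-02-2018 23:30", "29-03-2018 23:45"),
  End_shift   = c("28-02-2018 00:30", "30-03-2018 00:15")
)

counts <- shifts %>%
  mutate(Start_shift = dmy_hm(Start_shift),           # parse d-m-Y h:m strings
         End_shift   = dmy_hm(End_shift),
         # one vector of "HH:MM" labels per shift, in 15-minute steps
         time = map2(Start_shift, End_shift,
                     ~ format(seq(.x, .y, by = "15 mins"), "%H:%M"))) %>%
  unnest(time) %>%                                    # one row per worker and slot
  count(worker, time) %>%                             # how often each slot occurs
  pivot_wider(names_from = time, values_from = n,
              values_fill = 0L,                       # 0 instead of NA for unused slots
              names_sort = TRUE)                      # columns ordered 00:00, 00:15, ...

print(counts)
```

Note that the labels only fall on the 00/15/30/45 grid when a shift starts on a quarter hour, as all shifts in the posted data do.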