Question

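I have more than 5 million rows of appointment data (start/stop times) that I would like to convert into 15-minute blocks for demand forecasting and scheduling.

Example: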

Start time: 9:30

Stop time: 10:10

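I would like the columns 9:30-9:44, 9:45-9:59, and 10:00-10:14 to be filled with a one, and the other 93 columns to be filled with zero for that particular row.

Thanks.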

Answer 1

> dput <- structure(
+     list(
+         start = structure(c(1539764520, 1539763920, 1539765180, 1539765180, 1539764400, 1539764400), class = c("POSIXct", "POSIXt" ), tzone = ""), 
+         stop = structure(c(1539769320, 1539777420, 1539803940, 1539803940, 1539770700, 1539770700), class = c("POSIXct", "POSIXt" ), tzone = "")), 
+     row.names = c(NA, 6L), class = "data.frame")
> dput
                start                stop
1 2018-10-17 17:22:00 2018-10-17 18:42:00
2 2018-10-17 17:12:00 2018-10-17 20:57:00
3 2018-10-17 17:33:00 2018-10-18 04:19:00
4 2018-10-17 17:33:00 2018-10-18 04:19:00
5 2018-10-17 17:20:00 2018-10-17 19:05:00
6 2018-10-17 17:20:00 2018-10-17 19:05:00

See below; you can also switch to ceiling_date or floor_date:

> library(dplyr)
> library(lubridate)
> dput %>% mutate_all(round_date, '15 mins')
                start                stop
1 2018-10-17 17:15:00 2018-10-17 18:45:00
2 2018-10-17 17:15:00 2018-10-17 21:00:00
3 2018-10-17 17:30:00 2018-10-18 04:15:00
4 2018-10-17 17:30:00 2018-10-18 04:15:00
5 2018-10-17 17:15:00 2018-10-17 19:00:00
6 2018-10-17 17:15:00 2018-10-17 19:00:00
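
As a rough sketch of the floor_date/ceiling_date variant mentioned above: rounding the start down and the stop up gives a span that covers the whole appointment, and the block start times can then be listed with seq(). The data frame `appt`, its date, and the column names `block_start` and `block_stop` are assumptions made for illustration; only the 9:30 and 10:10 times come from the question.

```r
library(lubridate)

# One appointment from the question's example (the date is an arbitrary assumption).
appt <- data.frame(
  start = ymd_hms("2018-10-17 09:30:00"),
  stop  = ymd_hms("2018-10-17 10:10:00")
)

# Round the start down and the stop up so the snapped span covers the whole appointment.
appt$block_start <- floor_date(appt$start, "15 mins")
appt$block_stop  <- ceiling_date(appt$stop, "15 mins")

# Start times of every 15-minute block the appointment touches.
# The last element of the sequence is the exclusive end of the span, so it is dropped.
block_starts <- head(seq(appt$block_start, appt$block_stop, by = "15 min"), -1)
format(block_starts, "%H:%M")
# "09:30" "09:45" "10:00"
```

These are the three blocks the question expects to be set to one for that row.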

Answer 2

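OK, this might work. Your data is called df here. This approach relies on lubridate's int_overlaps function, which detects whether an appointment overlaps any of the intervals (blocks) you specify.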

library(tidyverse)
library(lubridate)

no_intervals <- 96  # number of 15-minute blocks (96 cover a full day, as in the question)
intervals_start <- ymd_hms("2018-10-17 10:00:00")  # start of the first block
intervals_width <- 15  # block width in minutes


# define intervals for the blocks you want to populate;
# each block ends one second before the next starts, because int_overlaps()
# counts a shared endpoint as an overlap
blocks <- lapply(1:no_intervals, function(shift){
  interval((intervals_start + (shift-1) * minutes(intervals_width)), 
           (intervals_start + (shift)   * minutes(intervals_width) - seconds(1)))}) %>% 
  `names<-`(paste0("int", 1 : no_intervals))

# add the overlaps of the appointments with the blocks to the df
# (start and stop are assumed to be "YYYY-MM-DD HH:MM:SS" strings)
res <- df %>% 
  mutate(appointment = interval(ymd_hms(start), ymd_hms(stop))) %>% 
  cbind(as.data.frame(lapply(blocks, int_overlaps, .$appointment))) %>% 
  mutate_at(vars(matches("^int")), as.numeric) # convert booleans to 0/1
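
For the demand-forecasting side, here is a small sketch of how the 0/1 block columns could be summed into a per-block count of concurrent appointments. The two sample appointments, the 09:15 starting point, and the names `appointment`, `block_start` and `flags` are made up for illustration. Note that int_overlaps() treats a shared endpoint as an overlap, so each block here ends one second before the next one begins.

```r
library(lubridate)

# Hypothetical sample appointments (times chosen for illustration only).
df <- data.frame(
  start = c("2018-10-17 09:30:00", "2018-10-17 09:50:00"),
  stop  = c("2018-10-17 10:10:00", "2018-10-17 10:40:00")
)
appointment <- interval(ymd_hms(df$start), ymd_hms(df$stop))

# Four 15-minute blocks starting at 09:15; each one ends a second before the next starts.
block_start <- ymd_hms("2018-10-17 09:15:00") + minutes(15 * 0:3)
blocks <- interval(block_start, block_start + minutes(15) - seconds(1))

# One row per appointment, one 0/1 column per block.
flags <- sapply(seq_along(blocks), function(i) {
  as.numeric(int_overlaps(blocks[i], appointment))
})
colnames(flags) <- format(block_start, "%H:%M")

flags           # indicator matrix
colSums(flags)  # number of appointments active in each block
```

With 5 million rows and 96 blocks the indicator matrix gets large, so it may be worth keeping only the column sums (or sums per day) if the per-row flags are not needed.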

Best way to convert start/stop times into a data frame for computing concurrency
