I have data with uneven time intervals, like this:
x=data.frame(date=rep('2014-07-24',5),from=c("14:12","14:12","14:30","14:24","14:32"),to=c("15:25","15:40","15:35","15:50","15:55"),Load=c(2,2,1,1,1))
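The 'from' and 'to' columns give the start and end times of the interval during which the load fluctuated by the corresponding amount. I want to convert this data into 15-minute intervals (96 blocks) for the corresponding date. So if the block 14:15-14:30 lies within a from-to interval, that interval's load value is assigned to it. If the block also lies within another interval, that interval's load value is added to the block as well.
Is there a way in R to check whether a block such as 00:00-00:15 (and the others) lies within an uneven interval such as 12:40-13:45, so that I can arrange the data accordingly, like this?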
y=data.frame(date=rep('2014-07-24'),block=c("14:15-14:30","14:30-14:45","14:45-15:00","15:00-15:15","15:15-15:30"),load=c(4,7,7,7,7))
Please help. Thanks a lot.
Answer 0 (score: 1)
Using foverlaps from data.table
I would approach it as follows:
1) Get proper datetime columns for both data tables:
x[, `:=` (from = as.POSIXct(paste(date,from)), to = as.POSIXct(paste(date,to)), date = NULL)]
y[, c("start","end") := tstrsplit(block, "-", fixed=TRUE)
  ][, `:=` (start = as.POSIXct(paste(date,start)),
            end = as.POSIXct(paste(date,end)),
            block = NULL, date = NULL)]
2) Set the keys:
setkey(x, from, to)
setkey(y, start, end)
3) Find the overlaps between x and y and take the maximum:
x.new <- foverlaps(y, x, type = "within")[, .(load.new = max(pmax(Load,load))),
                                          by = .(from, to)]
These steps result in:
> x.new
                  from                  to load.new
1: 2014-07-24 14:12:00 2014-07-24 15:25:00        7
2: 2014-07-24 14:12:00 2014-07-24 15:40:00        7
3: 2014-07-24 14:24:00 2014-07-24 15:50:00        7
4: 2014-07-24 14:30:00 2014-07-24 15:35:00        7
5: 2014-07-24 14:32:00 2014-07-24 15:55:00        7
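Step 3 relies on type = "within", which keeps a block only when it lies entirely inside a from-to interval. As a small illustration of how that differs from type = "any" (any partial overlap counts), here is a toy sketch. The tables subject and query and their integer values are made up for this example and are not part of the question's data.

```r
library(data.table)

# Hypothetical toy data: one subject interval [10, 20] and three query intervals
subject <- data.table(from = 10L, to = 20L, id = "A")
query   <- data.table(start = c(5L, 12L, 18L), end = c(11L, 15L, 25L))

# foverlaps() requires the second table to be keyed on its interval columns
setkey(subject, from, to)

# "within": the query interval must lie entirely inside the subject interval
res.within <- foverlaps(query, subject, by.x = c("start", "end"), type = "within")

# "any": any overlap at all is enough
res.any <- foverlaps(query, subject, by.x = c("start", "end"), type = "any")

# Unmatched query rows get NA in the subject's columns (default nomatch = NA)
n.within <- sum(!is.na(res.within$id))  # only [12, 15] is fully inside [10, 20]
n.any    <- sum(!is.na(res.any$id))     # all three touch [10, 20]
```

Which of the two is appropriate depends on how a partially covered 15-minute block should be treated, which the question leaves open.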
Data used:
library(data.table)  # provides data.table(), setkey(), tstrsplit() and foverlaps()

x <- data.table(date=rep('2014-07-24',5),
                from=c("14:12","14:12","14:30","14:24","14:32"),
                to=c("15:25","15:40","15:35","15:50","15:55"),
                Load=c(2,2,1,1,1))
y <- data.table(date=rep('2014-07-24'),
                block=c("14:15-14:30","14:30-14:45","14:45-15:00","15:00-15:15","15:15-15:30"),
                load=c(4,7,7,7,7))
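The steps above return one maximum per from-to interval. If the goal is instead the summed load for each of the 96 quarter-hour blocks, as the question describes, something along the following lines might work. This is only a sketch under assumptions: a block counts toward an interval only when it lies entirely inside it (type = "within"), times are parsed as UTC, and the names starts, blocks, ov and block.load are introduced here for illustration. Under this rule the sums will not necessarily reproduce every value in the question's example y.

```r
library(data.table)

# Interval data from the question
x <- data.table(date = rep("2014-07-24", 5),
                from = c("14:12", "14:12", "14:30", "14:24", "14:32"),
                to   = c("15:25", "15:40", "15:35", "15:50", "15:55"),
                Load = c(2, 2, 1, 1, 1))

# Convert to datetimes (tz = "UTC" is an assumption, chosen for reproducibility)
x[, `:=`(from = as.POSIXct(paste(date, from), tz = "UTC"),
         to   = as.POSIXct(paste(date, to), tz = "UTC"),
         date = NULL)]

# Build all 96 quarter-hour blocks of the day
starts <- seq(as.POSIXct("2014-07-24 00:00", tz = "UTC"),
              by = "15 min", length.out = 96)
blocks <- data.table(start = starts, end = starts + 15 * 60)

# The table being searched must be keyed on its interval columns
setkey(x, from, to)

# Match each block to every interval that fully contains it;
# blocks without a match are kept with NA in x's columns
ov <- foverlaps(blocks, x, by.x = c("start", "end"), type = "within")

# Sum the loads per block; unmatched blocks end up with a load of 0
block.load <- ov[, .(load = sum(Load, na.rm = TRUE)), by = .(start, end)]
```

Filtering with block.load[load > 0] shows only the blocks that received any load. Switching to type = "any" would also count blocks that are only partly covered by an interval.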