Find start and end times of overlapping and non-overlapping periods by factor variable

Date: 2018-10-04 20:02:13

Tags: r time data.table overlap

This question follows on from an earlier one: How to join 2 data tables by time interval and summarize overlapping and non-overlapping time periods by factor variable

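In addition to the durations (the answer to the earlier question), I would also like to find the start and end times of every overlapping and non-overlapping time period.

This uses the same example data as before, combined with the help I got on the earlier question.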

library( data.table )
library( lubridate )

set.seed(13)
EffortType = sample(c("A","B","C"), 100, replace = TRUE)
On = sample(seq(as.POSIXct('2016/01/01 01:00:00'), as.POSIXct('2016/01/03 01:00:00'), by = "1 sec"), 100, replace = FALSE)
Off = On + minutes(sample(1:60, 100, replace = TRUE))
Effort1 = data.table(EffortType, On, Off)

EffortType2 = sample(c("A","B","C"), 100, replace = TRUE)
On2 = sample(seq(as.POSIXct('2016/01/01 12:00:00'), as.POSIXct('2016/01/03 12:00:00'), by = "1 sec"), 100, replace = FALSE)
Off2 = On2 + minutes(sample(1:60, 100, replace = TRUE))
Effort2 = data.table(EffortType2, On2, Off2)

#create DT of seconds, spanning entire period - employing Wimpel's approach
dt.secs <- data.table( On = seq(min(Effort1$On, Effort2$On2),
                                max(Effort1$Off, Effort2$Off2)+60*60,
                                by= "1 sec"),
                       Off = seq(min(Effort1$On, Effort2$On2),
                                 max(Effort1$Off, Effort2$Off2)+60*60,
                                 by= "1 sec") +1)

#prep for using foverlaps
setkey(Effort1, On, Off)
setkey(Effort2, On2, Off2)
setkey(dt.secs, On, Off)

#overlap join both efforts on the dt.secs. 
s1 <- foverlaps(dt.secs, Effort1 ,type="within",nomatch=0L)
s2 <- foverlaps(dt.secs, Effort2 ,type="within",nomatch=0L)

#bind together
result <- rbindlist(list(s1,s2))[, `:=`(On=i.On, Off = i.Off)][, `:=`(i.On = NULL, i.Off = NULL)]

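So now I have a data table listing every second during which some kind of effort was under way, and I can work out the combination of effort types for each second: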

OnDT = result[,
               .(tt = paste(sort(unique(EffortType)), collapse=" "))
               , keyby=On]

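But I am stuck on how to convert consecutive seconds with the same effort combination into intervals, each with its own start and stop time.

If I had used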
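One way to collapse per-second rows into intervals is run-length grouping with data.table's rleid(). Below is a minimal sketch on a made-up stand-in for OnDT. The toy table, the gap and run columns, and the times are assumptions for illustration, not part of the data above. A new run starts whenever the combination changes or the seconds stop being consecutive, and each row is assumed to cover exactly one second, as in dt.secs.

```r
library(data.table)

# Hypothetical stand-in for OnDT: one row per second, tt = effort combination
t0 <- as.POSIXct("2016-01-01 01:00:00", tz = "UTC")
toy <- data.table(
  On = t0 + c(0:2, 3:4, 10:11),
  tt = c("A", "A", "A", "A B", "A B", "A", "A")
)
setkey(toy, On)  # make sure rows are in time order

# Flag rows that do not directly follow the previous second
toy[, gap := c(FALSE, diff(as.numeric(On)) > 1)]

# A run changes when the combination changes or a gap occurs
toy[, run := rleid(tt, cumsum(gap))]

# Each row covers [On, On + 1 second], so a run ends at its last On + 1
intervals <- toy[, .(start = min(On), end = max(On) + 1), by = .(run, tt)]
print(intervals)
```

Applied to the real data, the same grouping would run on OnDT in place of toy. Runs whose tt holds a single letter would then be the single-effort periods.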

matches = foverlaps(Effort1,Effort2,type="any",nomatch=0L)

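then I would use

#find new start and stop times for each interval
#(Effort2's columns are On2/Off2, so the join result has no i.On/i.Off)
matches$start = pmax(matches$On, matches$On2, na.rm=TRUE)
matches$end = pmin(matches$Off, matches$Off2, na.rm=TRUE)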

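I would like to get a data table similar to matches, but containing the times when only a single effort type was active. I did try calling foverlaps with nomatch = NA, but that did not give me all of the non-overlapping effort times.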
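A possible explanation, sketched on made-up intervals (x, y, idx and idy are hypothetical names, not from the data above): with nomatch = NA, foverlaps keeps the unmatched rows of its first argument only. It also returns whole input rows and does not split an interval into its overlapping and non-overlapping parts.

```r
library(data.table)

t0 <- as.POSIXct("2016-01-01 00:00:00", tz = "UTC")

# Hypothetical toy data: x1 partly overlaps y1; x2 and y2 overlap nothing
x <- data.table(idx = c("x1", "x2"),
                On  = t0 + c(0, 3600),
                Off = t0 + c(600, 4200))
y <- data.table(idy  = c("y1", "y2"),
                On2  = t0 + c(300, 7200),
                Off2 = t0 + c(900, 7800))
setkey(x, On, Off)
setkey(y, On2, Off2)

# All rows of x are kept; y columns are NA where x has no overlap
m <- foverlaps(x, y, type = "any", nomatch = NA)
print(m)
```

In this toy case x2 comes back with NA in the y columns, y2 does not appear at all, and the x1 row still carries its full On/Off range, not just the part shared with y1.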

0 answers:

No answers yet