I have the following data structure:
myDF <- data.frame(as.POSIXct(c("2010-02-16 12:45:37 CST",
"2010-02-16 13:22:23 CST",
"2010-02-16 13:49:47 CST",
"2010-02-16 14:23:13 CST",
"2010-02-16 16:29:17 CST",
"2010-02-16 16:49:26 CST")))
colnames(myDF) <- c("DateTimeArrival")
How can I count the events that occur within each hour, for example between 12:00:00 and 12:59:59? I would like the following result:
Time Number
12:59:59 1
13:59:59 2
14:59:59 1
15:59:59 0
16:59:59 2
Answer 0 (score: 2)
We can also use foverlaps from the data.table package:
library(data.table)
## create table of intervals
dt_intervals <- data.table(start_interval = seq(as.POSIXct("2010-02-16 11:00:00"), as.POSIXct("2010-02-16 17:00:00"), by="hour"),
end_interval = seq(as.POSIXct("2010-02-16 11:59:59"), as.POSIXct("2010-02-16 17:59:59"), by="hour"))
## set our df to a data.table
myDT <- data.table(myDF)
myDT[, DateTimeArrival_copy := DateTimeArrival]
setkey(dt_intervals, start_interval, end_interval)
setkey(myDT, DateTimeArrival, DateTimeArrival_copy)
foverlaps(dt_intervals,
myDT,
type="any")[, sum(!is.na(DateTimeArrival)), by=end_interval]
# end_interval V1
#1: 2010-02-16 11:59:59 0
#2: 2010-02-16 12:59:59 1
#3: 2010-02-16 13:59:59 2
#4: 2010-02-16 14:59:59 1
#5: 2010-02-16 15:59:59 0
#6: 2010-02-16 16:59:59 2
#7: 2010-02-16 17:59:59 0
Answer 1 (score: 2)
Here is another solution using cut and table:
> as.data.frame(table(cut(myDF$DateTimeArrival+3600,
breaks=seq(as.POSIXct("2010-02-16 11:59:59 PST"),
by="1 hour", length.out=7))))
Var1 Freq
1 2010-02-16 11:59:59 0
2 2010-02-16 12:59:59 1
3 2010-02-16 13:59:59 2
4 2010-02-16 14:59:59 1
5 2010-02-16 15:59:59 0
6 2010-02-16 16:59:59 2
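By default, cut labels the interval from 11:59:59 to 12:59:59 with its left endpoint, 11:59:59, but you want it labelled 12:59:59, so I add 3600 seconds (1 hour) to every timestamp before cutting.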
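The relabelling described above can also be done without shifting the data. The following is only a sketch, not the answerer's method: it puts the breaks on the hour and passes explicit labels to cut. The names arrivals, hour_breaks, bin_labels and hourly_counts are made up for illustration, and tz = "UTC" is an assumption added so the result does not depend on the local time zone.

```r
# Timestamps from the question; tz = "UTC" is an assumption for reproducibility.
arrivals <- as.POSIXct(c("2010-02-16 12:45:37", "2010-02-16 13:22:23",
                         "2010-02-16 13:49:47", "2010-02-16 14:23:13",
                         "2010-02-16 16:29:17", "2010-02-16 16:49:26"),
                       tz = "UTC")

# Breaks on the hour from 12:00:00 to 17:00:00 (six breaks, five bins).
hour_breaks <- seq(as.POSIXct("2010-02-16 12:00:00", tz = "UTC"),
                   by = "1 hour", length.out = 6)

# Label each bin [hh:00:00, hh+1:00:00) by its last whole second, hh:59:59.
bin_labels <- format(hour_breaks[-1] - 1, "%H:%M:%S", tz = "UTC")

# cut() keeps empty bins as factor levels, so table() reports them as 0.
hourly_counts <- as.data.frame(
  table(Time = cut(arrivals, breaks = hour_breaks, labels = bin_labels)),
  responseName = "Number"
)
print(hourly_counts)
```

Because the bins are left-closed, an event at exactly 12:59:59 stays in the 12:59:59 bin, whereas shifting by 3600 seconds against breaks at hh:59:59 would move that edge case into the next hour.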
Answer 2 (score: 1)
Use trunc to truncate each date/time back to the start of its hour, then aggregate:
trtimes <- as.POSIXct(trunc(myDF$DateTimeArrival, units="hours")) + 3599
aggregate(
Count ~ Time,
merge(
list(Time=seq(min(trtimes), max(trtimes), by="hour")),
list(Time=trtimes, Count=1),
all.x=TRUE
),
FUN=sum,
na.rm=TRUE,
na.action=na.pass
)
# Time Count
#1 2010-02-16 12:59:59 1
#2 2010-02-16 13:59:59 2
#3 2010-02-16 14:59:59 1
#4 2010-02-16 15:59:59 0
#5 2010-02-16 16:59:59 2
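The merge in the code above exists only to bring in hours that have no events. As a rough alternative sketch (not part of the original answer), the same truncate-then-count idea can be written with a factor whose levels cover every hour, so that table reports 0 for the empty ones. The helper to_label and the variable names are hypothetical, and tz = "UTC" is an assumption added for reproducibility.

```r
# Timestamps from the question; tz = "UTC" is an assumption for reproducibility.
arrivals <- as.POSIXct(c("2010-02-16 12:45:37", "2010-02-16 13:22:23",
                         "2010-02-16 13:49:47", "2010-02-16 14:23:13",
                         "2010-02-16 16:29:17", "2010-02-16 16:49:26"),
                       tz = "UTC")

# Truncate each timestamp to the start of its hour (trunc returns POSIXlt).
hour_start <- as.POSIXct(trunc(arrivals, units = "hours"))

# Every hour between the first and last observed hour, including empty ones.
all_hours <- seq(min(hour_start), max(hour_start), by = "hour")

# Hypothetical helper: label an hour by its last second, e.g. 12:59:59.
to_label <- function(x) format(x + 3599, "%H:%M:%S", tz = "UTC")

# Using all hours as factor levels makes table() count empty hours as 0.
counts <- table(factor(to_label(hour_start), levels = to_label(all_hours)))
result <- data.frame(Time = names(counts), Number = as.vector(counts))
print(result)
```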