Counting the number of events with "POSIXct" "POSIXt"

Time: 2016-03-15 22:11:06

Tags: r date

I have the following data structure:

myDF <- data.frame(as.POSIXct(c("2010-02-16 12:45:37 CST", 
                                "2010-02-16 13:22:23 CST", 
                                "2010-02-16 13:49:47 CST", 
                                "2010-02-16 14:23:13 CST", 
                                "2010-02-16 16:29:17 CST",
                                "2010-02-16 16:49:26 CST")))
colnames(myDF) <- c("DateTimeArrival")

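How can I get the number of events that occurred in each hour, for example between 12:00:00 and 12:59:59? I would like the following result: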

Time       Number
12:59:59   1
13:59:59   2
14:59:59   1
15:59:59   0
16:59:59   2

3 answers:

Answer 0: (score: 2)

We can also use foverlaps from library(data.table):

library(data.table)

## create table of intervals
dt_intervals <- data.table(start_interval = seq(as.POSIXct("2010-02-16 11:00:00"), as.POSIXct("2010-02-16 17:00:00"), by="hour"),
                           end_interval = seq(as.POSIXct("2010-02-16 11:59:59"), as.POSIXct("2010-02-16 17:59:59"), by="hour"))

## set our df to a data.table
myDT <- data.table(myDF)
myDT[, DateTimeArrival_copy := DateTimeArrival]

setkey(dt_intervals, start_interval, end_interval)
setkey(myDT, DateTimeArrival, DateTimeArrival_copy)

foverlaps(dt_intervals,
          myDT,
         type="any")[, sum(!is.na(DateTimeArrival)), by=end_interval]

#          end_interval V1
#1: 2010-02-16 11:59:59  0
#2: 2010-02-16 12:59:59  1
#3: 2010-02-16 13:59:59  2
#4: 2010-02-16 14:59:59  1
#5: 2010-02-16 15:59:59  0
#6: 2010-02-16 16:59:59  2
#7: 2010-02-16 17:59:59  0
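
As a sanity check on those counts, here is a minimal base R sketch of what the overlap join computes: for each hourly interval, the number of arrivals that fall between its start and its end. The names arrivals, n_events and check are made up for this illustration, and the time zone is fixed to UTC (an assumption, the question does not state one) so the result does not depend on the local setting.

```r
# Same six arrivals as in the question; tz = "UTC" is an assumption for reproducibility
arrivals <- as.POSIXct(c("2010-02-16 12:45:37", "2010-02-16 13:22:23",
                         "2010-02-16 13:49:47", "2010-02-16 14:23:13",
                         "2010-02-16 16:29:17", "2010-02-16 16:49:26"),
                       tz = "UTC")

# Hourly intervals 11:00:00-11:59:59 up to 17:00:00-17:59:59, as in the answer
start_interval <- seq(as.POSIXct("2010-02-16 11:00:00", tz = "UTC"),
                      as.POSIXct("2010-02-16 17:00:00", tz = "UTC"),
                      by = "hour")
end_interval <- start_interval + 3599

# For each interval, count the arrivals with start <= arrival <= end
n_events <- vapply(seq_along(start_interval), function(i) {
  sum(arrivals >= start_interval[i] & arrivals <= end_interval[i])
}, integer(1))

check <- data.frame(end_interval, n_events)
check
```

This loops over the intervals, so it is only meant for small inputs like this one. For large tables the keyed foverlaps join above is likely the better fit.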

Answer 1: (score: 2)

Here is another solution using cut and table:

as.data.frame(table(cut(myDF$DateTimeArrival+3600,
  breaks=seq(as.POSIXct("2010-02-16 11:59:59 PST"),
  by="1 hour", length.out=7))))

#                 Var1 Freq
#1 2010-02-16 11:59:59    0
#2 2010-02-16 12:59:59    1
#3 2010-02-16 13:59:59    2
#4 2010-02-16 14:59:59    1
#5 2010-02-16 15:59:59    0
#6 2010-02-16 16:59:59    2

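By default cut labels each interval by its left endpoint, so the interval from 11:59:59 to 12:59:59 is labelled 11:59:59. Since you want an arrival to be counted under the 59:59 label at the end of its own hour, I add 3600 seconds (1 hour) to every timestamp.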
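
The labelling can also be handled without shifting the data, by cutting on the hour and passing the labels explicitly. This is only a sketch of that variation, not the method used above: the names arrivals, breaks, end_labels and counts are made up here, and the time zone is fixed to UTC as an assumption.

```r
# Same six arrivals as in the question; tz = "UTC" is an assumption for reproducibility
arrivals <- as.POSIXct(c("2010-02-16 12:45:37", "2010-02-16 13:22:23",
                         "2010-02-16 13:49:47", "2010-02-16 14:23:13",
                         "2010-02-16 16:29:17", "2010-02-16 16:49:26"),
                       tz = "UTC")

# Six hourly break points from 12:00:00 to 17:00:00, giving five bins
breaks <- seq(as.POSIXct("2010-02-16 12:00:00", tz = "UTC"),
              by = "1 hour", length.out = 6)

# Label each bin by the last second it contains instead of by its start
end_labels <- format(breaks[-1] - 1, "%H:%M:%S")

# cut() on date-times uses left-closed bins [start, end) by default
counts <- as.data.frame(table(cut(arrivals, breaks = breaks, labels = end_labels)))
names(counts) <- c("Time", "Number")
counts
```

The Number column comes out as 1, 2, 1, 0, 2 for the labels 12:59:59 to 16:59:59. Because table works on the factor levels, the empty 15:59:59 bin is kept with a count of 0.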

Answer 2: (score: 1)

Use trunc to truncate the date/times back to the hour, then aggregate:

trtimes <- as.POSIXct(trunc(myDF$DateTimeArrival, units="hours")) + 3599

aggregate(
  Count ~ Time,
  merge(
    list(Time=seq(min(trtimes), max(trtimes), by="hour")),
    list(Time=trtimes, Count=1),
    all.x=TRUE
  ),
  FUN=sum, 
  na.rm=TRUE,
  na.action=na.pass
)

#                 Time Count
#1 2010-02-16 12:59:59     1
#2 2010-02-16 13:59:59     2
#3 2010-02-16 14:59:59     1
#4 2010-02-16 15:59:59     0
#5 2010-02-16 16:59:59     2
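
The same truncate-then-count idea can be sketched more compactly with tabulate in place of merge and aggregate. This is a hedged variation rather than the code above: the names arrivals, hour_end, all_hours and counts are made up for illustration, and the time zone is fixed to UTC as an assumption.

```r
# Same six arrivals as in the question; tz = "UTC" is an assumption for reproducibility
arrivals <- as.POSIXct(c("2010-02-16 12:45:37", "2010-02-16 13:22:23",
                         "2010-02-16 13:49:47", "2010-02-16 14:23:13",
                         "2010-02-16 16:29:17", "2010-02-16 16:49:26"),
                       tz = "UTC")

# Truncate each arrival to the start of its hour, then move to that hour's last second
hour_end <- as.POSIXct(trunc(arrivals, units = "hours")) + 3599

# Every hour between the first and last observed one, so empty hours are kept
all_hours <- seq(min(hour_end), max(hour_end), by = "hour")

# match() gives each arrival's position in all_hours; tabulate() counts per position
counts <- data.frame(
  Time  = all_hours,
  Count = tabulate(match(as.numeric(hour_end), as.numeric(all_hours)),
                   nbins = length(all_hours))
)
counts
```

Passing nbins is what keeps the 15:59:59 row with a count of 0. Without it, tabulate only goes up to the largest index it sees, which happens to be enough here but would drop trailing empty hours in general.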