Summing frequencies based on timestamp intervals

Time: 2016-05-09 18:10:33

Tags: r timestamp dplyr

I have the following dataset, which gives frequency counts at specific timestamps.

a <- read.table(header=TRUE, text="
Time Freq
7:00:36    3
7:00:55    0
7:02:18    8
7:02:54    3
7:04:20    6
7:04:36    0
7:05:52    4
7:06:17    0
7:07:47    3
7:08:03    0
                   ")
a  
      Time Freq
1  7:00:36    3
2  7:00:55    0
3  7:02:18    8
4  7:02:54    3
5  7:04:20    6
6  7:04:36    0
7  7:05:52    4
8  7:06:17    0
9  7:07:47    3
10 7:08:03    0

str(a)
'data.frame':   10 obs. of  2 variables:
 $ Time: Factor w/ 10 levels "7:00:36","7:00:55",..: 1 2 3 4 5 6 7 8 9 10
 $ Freq: int  3 0 8 3 6 0 4 0 3 0

a$Time <- as.POSIXct(strptime(a$Time, "%H:%M:%OS"))

str(a)
'data.frame':   10 obs. of  2 variables:
 $ Time: POSIXct, format: "2016-05-09 07:00:36" "2016-05-09 07:00:55" "2016-05-09 07:02:18" "2016-05-09 07:02:54" ...
 $ Freq: int  3 0 8 3 6 0 4 0 3 0

I want to compute the sum of the frequencies over fixed time intervals, e.g. 2 min. The desired result is:

           interval frequency
1 07:00:01-07:02:00         3
2 07:02:01-07:04:00        11
3 07:04:01-07:06:00        10
4 07:06:01-07:08:00         3
5 07:08:01-07:10:00         0

This is my attempt:

library(dplyr)
interval <- 2

summary <- a %>%
  mutate(interval = floor((as.numeric(Time - min(Time)))/interval)+1) %>%
  group_by(interval, add = TRUE) %>%
  summarize(starttime = min(Time),
            frequency = n()) %>%
  select(-interval)
summary
Source: local data frame [10 x 2]

             starttime frequency
                (time)     (int)
1  2016-05-09 07:00:36         1
2  2016-05-09 07:00:55         1
3  2016-05-09 07:02:18         1
4  2016-05-09 07:02:54         1
5  2016-05-09 07:04:20         1
6  2016-05-09 07:04:36         1
7  2016-05-09 07:05:52         1
8  2016-05-09 07:06:17         1
9  2016-05-09 07:07:47         1
10 2016-05-09 07:08:03         1

1 answer:

Answer 0 (score: 1)

A base R approach using cut and aggregate will work:

a$Time <- as.POSIXct(strptime(a$Time, "%H:%M:%OS"))

# get a factor variable that contains separate levels for every 2 minute interval
a$interval <- cut(a$Time, breaks="2 min")
# aggregate the data, summing the frequencies
aggregate(Freq ~ interval, data=a, FUN=sum)
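For completeness, here is a minimal dplyr sketch of the same idea that also builds interval labels in the format shown in the question. It is only one possible way to do it, not a definitive implementation. The names interval_min, origin and width are hypothetical, and a fixed date with the UTC time zone is assumed so that the result does not depend on when or where the code is run. Compared with the attempt in the question, it measures the time difference explicitly in seconds, divides by the bin width in seconds, starts the bins at the first full minute rather than at min(Time), and sums Freq instead of counting rows with n().

```r
library(dplyr)

# Same data as in the question. The date and time zone are assumptions made
# only to keep the example reproducible.
a <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "
Time Freq
7:00:36    3
7:00:55    0
7:02:18    8
7:02:54    3
7:04:20    6
7:04:36    0
7:05:52    4
7:06:17    0
7:07:47    3
7:08:03    0
")
a$Time <- as.POSIXct(paste("2016-05-09", a$Time),
                     format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

interval_min <- 2                 # hypothetical name: bin width in minutes
width <- interval_min * 60        # bin width in seconds
# Start the first bin at the full minute at or before the earliest timestamp.
origin <- as.POSIXct(trunc(min(a$Time), units = "mins"))

result <- a %>%
  # Zero-based bin index; each bin covers [start, start + width).
  mutate(bin = floor(as.numeric(difftime(Time, origin, units = "secs")) / width)) %>%
  group_by(bin) %>%
  summarise(frequency = sum(Freq)) %>%
  # Build labels such as 07:00:01-07:02:00 from the start of each bin.
  mutate(start = origin + bin * width,
         interval = paste(format(start + 1, "%H:%M:%S"),
                          format(start + width, "%H:%M:%S"),
                          sep = "-")) %>%
  select(interval, frequency)

print(result)
```

Two caveats with this sketch: a timestamp that falls exactly on a boundary (for example 07:02:00) is assigned to the bin that starts there, which differs slightly from what the labels suggest, and bins that contain no rows at all do not appear in the result.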