I have the following dataset, which gives frequency counts at specific timestamps.
a <- read.table(header=TRUE, text="
Time Freq
7:00:36 3
7:00:55 0
7:02:18 8
7:02:54 3
7:04:20 6
7:04:36 0
7:05:52 4
7:06:17 0
7:07:47 3
7:08:03 0
")
a
Time Freq
1 7:00:36 3
2 7:00:55 0
3 7:02:18 8
4 7:02:54 3
5 7:04:20 6
6 7:04:36 0
7 7:05:52 4
8 7:06:17 0
9 7:07:47 3
10 7:08:03 0
str(a)
'data.frame': 10 obs. of 2 variables:
$ Time: Factor w/ 10 levels "7:00:36","7:00:55",..: 1 2 3 4 5 6 7 8 9 10
$ Freq: int 3 0 8 3 6 0 4 0 3 0
a$Time <- as.POSIXct(strptime(a$Time, "%H:%M:%OS"))
str(a)
'data.frame': 10 obs. of 2 variables:
$ Time: POSIXct, format: "2016-05-09 07:00:36" "2016-05-09 07:00:55" "2016-05-09 07:02:18" "2016-05-09 07:02:54" ...
$ Freq: int 3 0 8 3 6 0 4 0 3 0
I want to compute the sum of the frequencies over fixed time intervals, e.g. 2 min. The expected result is as follows:
interval frequency
1 07:00:01-07:02:00 3
2 07:02:01-07:04:00 11
3 07:04:01-07:06:00 10
4 07:06:01-07:08:00 3
5 07:08:01-07:10:00 0
Here is my attempt:
library(dplyr)
interval <- 2
summary <- a %>%
mutate(interval = floor((as.numeric(Time - min(Time)))/interval)+1) %>%
group_by(interval, add = TRUE) %>%
summarize(starttime = min(Time),
frequency = n()) %>%
select(-interval)
summary
Source: local data frame [10 x 2]
starttime frequency
(time) (int)
1 2016-05-09 07:00:36 1
2 2016-05-09 07:00:55 1
3 2016-05-09 07:02:18 1
4 2016-05-09 07:02:54 1
5 2016-05-09 07:04:20 1
6 2016-05-09 07:04:36 1
7 2016-05-09 07:05:52 1
8 2016-05-09 07:06:17 1
9 2016-05-09 07:07:47 1
10 2016-05-09 07:08:03 1
Answer 0 (score: 1)
A base R approach using cut and aggregate will work:
# convert the time strings to POSIXct (skip this if a$Time was already converted above)
a$Time <- as.POSIXct(strptime(a$Time, "%H:%M:%OS"))
# get a factor variable that contains separate levels for every 2 minute interval
a$interval <- cut(a$Time, breaks="2 min")
# aggregate the data, summing the frequencies
aggregate(Freq ~ interval, data=a, FUN=sum)
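For comparison, here is a minimal dplyr sketch in the spirit of the attempt in the question. It is an illustration under stated assumptions, not part of the original answer. It swaps cut for plain floor arithmetic on the epoch seconds, and it uses sum(Freq) instead of n(), since n() only counts rows. The fixed date, the UTC time zone, and the names width, start and result are assumptions introduced for the example. The bins are half-open, [start, end), so the labels read 07:00:00-07:02:00 rather than the 07:00:01-07:02:00 style shown in the question.

```r
library(dplyr)

a <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "
Time Freq
7:00:36 3
7:00:55 0
7:02:18 8
7:02:54 3
7:04:20 6
7:04:36 0
7:05:52 4
7:06:17 0
7:07:47 3
7:08:03 0
")

# Assumption: pin the date and time zone so the result does not depend
# on when or where the code is run
a$Time <- as.POSIXct(paste("2016-05-09", a$Time),
                     format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

width <- 2 * 60  # interval width in seconds (2 minutes)

result <- a %>%
  # floor each timestamp to the start of its 2-minute bin
  mutate(start = as.POSIXct(floor(as.numeric(Time) / width) * width,
                            origin = "1970-01-01", tz = "UTC")) %>%
  group_by(start) %>%
  # sum the counts in each bin (n() would only count the rows)
  summarise(frequency = sum(Freq)) %>%
  # build a "start-end" label for each bin
  mutate(interval = paste(format(start, "%H:%M:%S"),
                          format(start + width, "%H:%M:%S"), sep = "-")) %>%
  select(interval, frequency)

print(result)
```

Because the bin width is given in seconds, the same pipeline should work for other interval lengths by changing width alone. Bins that contain no rows at all will not appear in the result; that does not matter for this dataset, where every 2-minute bin has at least one row.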