Calculating the average value per minute in R

Time: 2016-08-24 12:42:09

Tags: r date data.table average

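I have a data.table with two columns (date and value), and I want to calculate the average value for every minute (or every 15 minutes).

  • At first I thought I should split the date into hours and minutes
  • and then calculate the average over each interval

But I really don't know how to do that; maybe you have an idea.

For example, here is some simple data.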

date                Value
2015-07-01 00:00:23 1.83
2015-07-01 00:00:24 1.68
2015-07-01 00:00:25 1.29
2015-07-01 00:00:40 14.23
2015-07-01 00:00:41 0.96
2015-07-01 00:00:46 4.93
2015-07-01 00:01:12 26.44
2015-07-01 00:02:02 49.66
2015-07-01 00:02:05 3.00
2015-07-01 00:02:08 3.19
2015-07-01 00:02:27 19.42
2015-07-01 00:02:32 4.44
2015-07-01 00:02:45 12.77
2015-07-01 00:02:49 4.44
2015-07-01 00:03:40 50.71
2015-07-01 00:03:50 10.64
2015-07-01 00:03:52 1.18
2015-07-01 00:03:52 0.99
2015-07-01 00:03:54 1.32
2015-07-01 00:03:56 2.20

Here is the code that generates the test data:

library(data.table)
dd <- data.table(date = c("2015-07-01 00:00:23", "2015-07-01 00:00:24", "2015-07-01 00:00:25", "2015-07-01 00:00:40", "2015-07-01 00:00:46", "2015-07-01 00:01:12", "2015-07-01 00:02:02", "2015-07-01 00:02:08", "2015-07-01 00:02:27", "2015-07-01 00:02:32", "2015-07-01 00:02:45", "2015-07-01 00:02:49", "2015-07-01 00:03:40", "2015-07-01 00:03:50", "2015-07-01 00:03:52", "2015-07-01 00:03:54", "2015-07-01 00:03:56"),
             value = c(1.83, 1.68, 1.29, 14.23, 0.96, 4.93, 26.44, 3.00, 3.19, 19.42, 4.44, 50.71, 10.64, 1.18, 0.99, 1.32, 2.20))

2 answers:

Answer 0: (score: 4)

Since by "quarterly" you mean "every quarter of an hour", I would convert your data.table to an xts object and use xts::period.apply:

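library(xts)
# Convert the date column to POSIXct (by reference), then build an xts object from dd
x <- as.xts(dd[,date := as.POSIXct(date)])
# Average the values within each 15-minute period
period.apply(x, endpoints(x, "minutes", 15), mean)
#                        value
# 2015-07-01 00:03:56 8.732353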
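
For the per-minute case from the question, the same idea should carry over by letting endpoints() split on every minute instead of every 15. A minimal sketch, assuming a small made-up sample (sample_dt and its values are illustrative, not the question's data):

```r
library(data.table)
library(xts)

# Illustrative sample only; the values are chosen so the means are easy to check
sample_dt <- data.table(
  date  = as.POSIXct(c("2015-07-01 00:00:23", "2015-07-01 00:00:40",
                       "2015-07-01 00:01:12", "2015-07-01 00:02:02",
                       "2015-07-01 00:02:08"), tz = "UTC"),
  value = c(1, 3, 5, 2, 4)
)

x <- as.xts(sample_dt)

# endpoints(x, "minutes") marks the last observation of each minute;
# period.apply then averages the rows between consecutive endpoints
per_minute <- period.apply(x, endpoints(x, "minutes"), mean)
per_minute
```

Each row of the result is stamped with the last observation of its minute, not with the start of the minute.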

If by "quarterly" you mean "a quarter of a year", then you can use my original answer:

You can use zoo::as.yearqtr to create quarterly time values to aggregate by, and then use the normal data.table aggregation steps.

dd[, avg := mean(value), by = zoo::as.yearqtr(dd$date, "%Y-%m-%d")]
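
Note that := adds the quarterly mean as a new column on every row. If a one-row-per-quarter summary is wanted instead, something along these lines should work; sample_dt and its values are made up for illustration, and the column names Quarter and Avg are my own choice:

```r
library(data.table)

# Illustrative sample spanning two calendar quarters
sample_dt <- data.table(
  date  = as.POSIXct(c("2015-02-10 08:00:00", "2015-03-05 09:30:00",
                       "2015-07-01 00:00:23", "2015-08-15 12:00:00"), tz = "UTC"),
  value = c(2, 4, 10, 20)
)

# Group by year-quarter and return one averaged row per quarter
quarterly <- sample_dt[, .(Avg = mean(value)),
                       by = .(Quarter = zoo::as.yearqtr(date))]
quarterly
```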

Answer 1: (score: 3)

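We can use the minute function from the lubridate package. Note that data.table has its own hour function, which the code below calls explicitly as data.table::hour.

We can use the cut function to bin the minutes into quarters of an hour.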

library(lubridate)
# right = FALSE gives left-closed bins [0,15), [15,30), ... so minute 15 starts the second quarter
dd[, c('Hour', 'Minute') := .(data.table::hour(date), minute(date))
 ][, Minute_Cut := cut(Minute, breaks = c(0,15,30,45,60), include.lowest = TRUE, right = FALSE)
 ][, .(Avg = mean(value)), .(Hour, Minute_Cut)]

#    Hour Minute_Cut      Avg
# 1:    0     [0,15) 8.732353
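
Whether a boundary minute such as 15 or 30 belongs to the earlier or the later quarter depends on how cut closes its intervals. A small base R sketch (the minutes vector is made up for illustration) comparing the default right-closed intervals with right = FALSE:

```r
# Illustrative minute values, including the boundaries 15 and 30
minutes <- c(0, 14, 15, 29, 30, 59)
breaks  <- c(0, 15, 30, 45, 60)

# Default: intervals are closed on the right, so 15 falls into the first bin
right_closed <- cut(minutes, breaks = breaks, include.lowest = TRUE)

# right = FALSE: intervals are closed on the left, so 15 opens the second bin
left_closed <- cut(minutes, breaks = breaks, include.lowest = TRUE, right = FALSE)

as.character(right_closed)
# "[0,15]"  "[0,15]"  "[0,15]"  "(15,30]" "(15,30]" "(45,60]"
as.character(left_closed)
# "[0,15)"  "[0,15)"  "[15,30)" "[15,30)" "[30,45)" "[45,60]"
```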

If you only want the average for each minute, we can skip the cut step:

dd[, c('Hour', 'Minute') := .(data.table::hour(date), minute(date))
 ][, .(Avg = mean(value)), .(Hour, Minute)]

#    Hour Minute      Avg
# 1:    0      0  3.99800
# 2:    0      1  4.93000
# 3:    0      2 17.86667
# 4:    0      3  3.26600
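
One thing to watch when grouping by Hour and Minute is that rows from different days that share the same clock time end up in the same group. If the data spans more than one day, flooring each timestamp to the start of its window may be safer. A rough sketch, where average_by_minutes, window_start and the sample data are hypothetical names and values introduced for illustration:

```r
library(data.table)

# Hypothetical helper: average `value` over consecutive n-minute windows.
# Assumes `dt` has a POSIXct column `date` and a numeric column `value`.
average_by_minutes <- function(dt, n_minutes = 1) {
  width <- 60 * n_minutes  # window width in seconds
  dt[, .(Avg = mean(value)),
     by = .(window_start = as.POSIXct(floor(as.numeric(date) / width) * width,
                                      origin = "1970-01-01", tz = "UTC"))]
}

# Illustrative sample covering two days
sample_dt <- data.table(
  date  = as.POSIXct(c("2015-07-01 00:00:23", "2015-07-01 00:00:40",
                       "2015-07-01 00:01:12", "2015-07-01 00:16:00",
                       "2015-07-02 00:00:30"), tz = "UTC"),
  value = c(1, 3, 5, 7, 9)
)

per_minute  <- average_by_minutes(sample_dt, 1)   # one row per minute that has data
per_quarter <- average_by_minutes(sample_dt, 15)  # one row per 15-minute window
per_quarter
```

Here the two observations at 00:00 on different days stay in separate windows, and switching between 1-minute and 15-minute averages is just a change of n_minutes.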