R: Calculating the mean over a specific time window in a time-series data frame

Time: 2016-04-05 20:44:26

Tags: r

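My dataset is sampled at 1-minute intervals and is a bit noisy. I would therefore like to compute, for each hour, the mean over minutes 25 to 35, and let that value represent the hour at minute 30.

For example, the averages would be at 00:30 (mean of 00:25 to 00:35), 01:30 (mean of 01:25 to 01:35), 02:30 (mean of 02:25 to 02:35), and so on.

Can this be done in R?

Here is my dataset: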

  set.seed(1)
  DateTime <- seq(as.POSIXct("2010/1/1 00:00"), as.POSIXct("2010/1/5 00:00"), "min")
  value <- rnorm(n=length(DateTime), mean=100, sd=1)
  df <- data.frame(DateTime, value)

Thanks a lot.

3 answers:

Answer 0: (score: 3)

Here is one way:

library(dplyr)
df %>% 
  filter(between(as.numeric(format(DateTime, "%M")), 25, 35)) %>% 
  group_by(hour=format(DateTime, "%Y-%m-%d %H")) %>%
  summarise(value=mean(value))

Answer 1: (score: 2)

Since you want to average over a subset of each period, I think it makes sense to subset the data.frame first and then aggregate:

aggregate(
    value~cbind(time=strftime(DateTime,'%Y-%m-%d %H:30:00')),
    subset(df,{ m <- strftime(DateTime,'%M'); m>='25' & m<='35'; }),
    mean
);
##                   time     value
## 1  2010-01-01 00:30:00  99.82317
## 2  2010-01-01 01:30:00 100.58184
## 3  2010-01-01 02:30:00  99.54985
## 4  2010-01-01 03:30:00 100.47238
## 5  2010-01-01 04:30:00 100.05517
## 6  2010-01-01 05:30:00  99.96252
## 7  2010-01-01 06:30:00  99.79512
## 8  2010-01-01 07:30:00  99.06791
## 9  2010-01-01 08:30:00  99.58731
## 10 2010-01-01 09:30:00 100.27202
## 11 2010-01-01 10:30:00  99.60758
## 12 2010-01-01 11:30:00  99.92074
## 13 2010-01-01 12:30:00  99.65819
## 14 2010-01-01 13:30:00 100.04202
## 15 2010-01-01 14:30:00 100.04461
## 16 2010-01-01 15:30:00 100.11609
## 17 2010-01-01 16:30:00 100.08631
## 18 2010-01-01 17:30:00 100.41956
## 19 2010-01-01 18:30:00  99.98065
## 20 2010-01-01 19:30:00 100.07341
## 21 2010-01-01 20:30:00 100.20281
## 22 2010-01-01 21:30:00 100.86013
## 23 2010-01-01 22:30:00  99.68170
## 24 2010-01-01 23:30:00  99.68097
## 25 2010-01-02 00:30:00  99.58603
## 26 2010-01-02 01:30:00 100.10178
## 27 2010-01-02 02:30:00  99.78766
## 28 2010-01-02 03:30:00 100.02220
## 29 2010-01-02 04:30:00  99.83427
## 30 2010-01-02 05:30:00  99.74934
## 31 2010-01-02 06:30:00  99.99594
## 32 2010-01-02 07:30:00 100.08257
## 33 2010-01-02 08:30:00  99.47077
## 34 2010-01-02 09:30:00  99.81419
## 35 2010-01-02 10:30:00 100.13294
## 36 2010-01-02 11:30:00  99.78352
## 37 2010-01-02 12:30:00 100.04590
## 38 2010-01-02 13:30:00  99.91061
## 39 2010-01-02 14:30:00 100.61730
## 40 2010-01-02 15:30:00 100.18539
## 41 2010-01-02 16:30:00  99.45165
## 42 2010-01-02 17:30:00 100.09894
## 43 2010-01-02 18:30:00 100.04131
## 44 2010-01-02 19:30:00  99.58399
## 45 2010-01-02 20:30:00  99.75524
## 46 2010-01-02 21:30:00  99.94079
## 47 2010-01-02 22:30:00 100.26533
## 48 2010-01-02 23:30:00 100.35354
## 49 2010-01-03 00:30:00 100.31141
## 50 2010-01-03 01:30:00 100.10709
## 51 2010-01-03 02:30:00  99.41102
## 52 2010-01-03 03:30:00 100.07964
## 53 2010-01-03 04:30:00  99.88183
## 54 2010-01-03 05:30:00  99.91112
## 55 2010-01-03 06:30:00  99.71431
## 56 2010-01-03 07:30:00 100.48585
## 57 2010-01-03 08:30:00 100.35096
## 58 2010-01-03 09:30:00 100.00060
## 59 2010-01-03 10:30:00 100.03858
## 60 2010-01-03 11:30:00  99.95713
## 61 2010-01-03 12:30:00  99.18699
## 62 2010-01-03 13:30:00  99.49216
## 63 2010-01-03 14:30:00  99.37762
## 64 2010-01-03 15:30:00  99.68642
## 65 2010-01-03 16:30:00  99.84921
## 66 2010-01-03 17:30:00  99.84039
## 67 2010-01-03 18:30:00  99.90989
## 68 2010-01-03 19:30:00  99.95421
## 69 2010-01-03 20:30:00 100.01276
## 70 2010-01-03 21:30:00 100.14585
## 71 2010-01-03 22:30:00  99.54110
## 72 2010-01-03 23:30:00 100.02526
## 73 2010-01-04 00:30:00 100.04476
## 74 2010-01-04 01:30:00  99.61132
## 75 2010-01-04 02:30:00  99.94782
## 76 2010-01-04 03:30:00  99.44863
## 77 2010-01-04 04:30:00  99.91305
## 78 2010-01-04 05:30:00 100.25428
## 79 2010-01-04 06:30:00  99.86279
## 80 2010-01-04 07:30:00  99.63516
## 81 2010-01-04 08:30:00  99.65747
## 82 2010-01-04 09:30:00  99.57810
## 83 2010-01-04 10:30:00  99.77603
## 84 2010-01-04 11:30:00  99.85140
## 85 2010-01-04 12:30:00 100.82995
## 86 2010-01-04 13:30:00 100.26138
## 87 2010-01-04 14:30:00 100.25851
## 88 2010-01-04 15:30:00  99.92685
## 89 2010-01-04 16:30:00 100.00825
## 90 2010-01-04 17:30:00 100.24437
## 91 2010-01-04 18:30:00  99.62711
## 92 2010-01-04 19:30:00  99.93999
## 93 2010-01-04 20:30:00  99.82477
## 94 2010-01-04 21:30:00 100.15321
## 95 2010-01-04 22:30:00  99.88370
## 96 2010-01-04 23:30:00 100.06657

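Answer 2: (score: 2)

I think the existing answers are not general enough, because they do not account for the case where a time interval falls within the windows of more than one midpoint.

I would use shift from the data.table package: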
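To illustrate the overlap case, here is a small base R sketch. It assumes hypothetical midpoints spaced 10 minutes apart, each with a window of plus or minus 5 minutes (not the hourly spacing from the question), so that neighbouring windows share a boundary minute. A group-by on a single label cannot assign such a minute to both windows.

```r
# Hypothetical setup: midpoints every 10 minutes, each with a +/- 5 minute window.
minutes <- 0:59
mids_min <- seq(10, 50, by = 10)
half_width <- 5

# For every minute, count how many windows contain it.
n_windows <- sapply(minutes, function(m) sum(abs(m - mids_min) <= half_width))

# Minutes that belong to more than one window.
shared <- minutes[n_windows > 1]
print(shared)
```

With the hourly midpoints and 11-minute windows from the question, no minute is shared, which is why the group-by answers work there.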

library(data.table)
setDT(df)

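First set the interval parameters to match the window you chose above. This computes, for every row of the table, the mean over an 11-row (11-minute) window: the row itself, the 5 rows before it, and the 5 rows after it: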

df[, ave_val :=  
     Reduce('+',c(shift(value, 0:5L, type = "lag"),shift(value, 1:5L, type = "lead")))/11
   ]
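As a cross-check on the window arithmetic, the same centered 11-point mean can be sketched in base R with stats::filter. This is an alternative shown for illustration only, not part of the answer; the vector x and its seed are made up for the example.

```r
set.seed(1)
x <- rnorm(60, mean = 100, sd = 1)  # illustrative data: one hour, minutes 0..59

# Centered moving average: equal weights over 11 points, 5 on each side.
roll <- as.numeric(stats::filter(x, rep(1 / 11, 11), sides = 2))

# Minute 30 is element 31 (minutes start at 0); its window is minutes 25..35.
direct <- mean(x[26:36])
print(all.equal(roll[31], direct))

# The first and last 5 positions have incomplete windows and are NA.
print(which(is.na(roll)))
```

The shift-based version behaves the same way at the edges, because shift() pads with NA by default.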

Then generate the midpoints you want:

mids <- seq(as.POSIXct("2010/1/1 00:00"), as.POSIXct("2010/1/5 00:00"), by = 60*60) + 30*60 # every hour starting at 0:30

Then filter:

setkey(df,DateTime)
df[J(mids)]
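The keyed join keeps the rows whose DateTime equals one of the midpoints. The same lookup can be sketched in base R with match(), here on a shortened, made-up time range with an explicit UTC time zone (both are assumptions for the example). A midpoint without a matching row yields NA, which also happens for the last element of mids above, since 2010-01-05 00:30 lies after the final timestamp in the data.

```r
# Illustrative 3-hour range at 1-minute resolution (UTC assumed).
DateTime <- seq(as.POSIXct("2010-01-01 00:00", tz = "UTC"),
                as.POSIXct("2010-01-01 03:00", tz = "UTC"), by = "min")

# Hourly midpoints; the last one (03:30) is past the end of DateTime.
mids <- seq(as.POSIXct("2010-01-01 00:30", tz = "UTC"),
            as.POSIXct("2010-01-01 03:30", tz = "UTC"), by = "hour")

# Row index of each midpoint, or NA when there is no such timestamp.
idx <- match(as.numeric(mids), as.numeric(DateTime))
print(idx)
```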