Grouping data by varying time intervals (3, 6, 12, 24 hours)

Time: 2014-11-09 19:07:12

Tags: r

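I have to analyze some hourly data, and I would like to know whether I can use R to group the data into 3-, 6-, 12-, and 24-hour intervals. I tried using aggregate, but without success. Does anyone know how to do this?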

1 answer:

Answer 0 (score: 2)

I'm not sure what your data looks like, so I'll use a simple data set to show some of the possibilities of the data.table package:

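# options(continue="  ")  # optional: show continuation lines without the "+" prompt
library(data.table)
##
set.seed(123)
# One year of hourly observations (365 days x 24 hours), ending today;
# because of Sys.Date(), the dates depend on the day the code is run
Dt <- data.table(
  Date=rep((Sys.Date()-364)+0:364,each=24),
  Hour=rep(0:23,365),
  Value=rnorm(24*365))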
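
If your data has a single timestamp column rather than separate Date and Hour columns, you would first need to derive them. A minimal sketch in base R, assuming a POSIXct column in UTC; the data frame raw and the column name Timestamp are made up for illustration:

```r
# Hypothetical input: 48 hourly timestamps with random values
raw <- data.frame(
  Timestamp = seq(as.POSIXct("2014-01-01 00:00:00", tz = "UTC"),
                  by = "hour", length.out = 48),
  Value = rnorm(48))

# Split the timestamp into a calendar date and an integer hour (0-23)
raw$Date <- as.Date(raw$Timestamp, tz = "UTC")
raw$Hour <- as.integer(format(raw$Timestamp, "%H"))

head(raw, 3)
```

The result can then be converted with as.data.table(raw) and grouped on Date and Hour in the same way as the simulated data.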

You can use integer division, %/%, to group by different time intervals:
# 3 Hours
> Dt[,list(Mean=mean(Value)),
     by=list(FromHour=3*(Hour%/%3),
             ToHour=(3*(Hour%/%3))+2)]
   FromHour ToHour         Mean
1:        0      2  0.035449102
2:        3      5  0.036266830
3:        6      8 -0.013018137
4:        9     11 -0.024109474
5:       12     14 -0.019402564
6:       15     17 -0.009076756
7:       18     20  0.040802064
8:       21     23 -0.015103750
# 6 Hours
> Dt[,list(Mean=mean(Value)),
     by=list(FromHour=6*(Hour%/%6),
             ToHour=(6*(Hour%/%6))+5)]
   FromHour ToHour        Mean
1:        0      5  0.03585797
2:        6     11 -0.01856381
3:       12     17 -0.01423966
4:       18     23  0.01284916
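
To see why this works: Hour %/% n drops the remainder, so every hour in the same n-hour block gets the same block index, and multiplying by n gives the first hour of that block. A small sketch in base R that wraps this for any width, including the 12- and 24-hour cases from the question; the helper name hourBlock is hypothetical:

```r
# Hypothetical helper: first hour of the n-hour block that contains each hour
hourBlock <- function(hour, n) n * (hour %/% n)

hours <- 0:23
hourBlock(hours, 3)   # 0 0 0 3 3 3 6 6 6 ... 21 21 21
hourBlock(hours, 12)  # twelve 0s followed by twelve 12s
hourBlock(hours, 24)  # all 0: one block per day
```

Inside data.table this could be used as by=list(FromHour=hourBlock(Hour,12)). n should divide 24, otherwise the last block of each day is shorter than the others.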

Or, with a daily dimension added:

> Dt[,list(Mean=mean(Value)),
     by=list(Date,
             FromHour=3*(Hour%/%3),
             ToHour=(3*(Hour%/%3))+2)]
            Date FromHour ToHour        Mean
   1: 2013-11-10        0      2  0.25601839
   2: 2013-11-10        3      5  0.63828704
   3: 2013-11-10        6      8 -0.49699929
   4: 2013-11-10        9     11  0.37941122
   5: 2013-11-10       12     14 -0.01479566
  ---                                       
2916: 2014-11-09        9     11  0.69827715
2917: 2014-11-09       12     14 -0.40997757
2918: 2014-11-09       15     17 -0.36256883
2919: 2014-11-09       18     20  0.43272162
2920: 2014-11-09       21     23  0.67656169
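
Since the question mentions aggregate, here is a rough base R equivalent of the per-day grouping, offered as a sketch rather than a drop-in replacement. The data frame df is a small stand-in built the same way as Dt, with fixed dates so it does not depend on Sys.Date():

```r
set.seed(123)
# Stand-in data: 3 days of hourly values
df <- data.frame(
  Date  = rep(as.Date("2014-01-01") + 0:2, each = 24),
  Hour  = rep(0:23, 3),
  Value = rnorm(24 * 3))

# Add the block start hour, then average Value within each Date/block pair
df$FromHour <- 3 * (df$Hour %/% 3)
res <- aggregate(Value ~ Date + FromHour, data = df, FUN = mean)

# Order the rows by date, then by block start
res <- res[order(res$Date, res$FromHour), ]
head(res)
```

Changing the 3 to 6, 12, or 24 gives the other interval widths.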

If you want a more comprehensive summary, for example using summary, you can do the following:

summaryNames <- c(
    "Min","Q1","Med",
    "Mean","Q3","Max")
#
Hourly.6 <- Dt[,sapply(.SD,function(x){
    as.list(summary(x))
  }),
  .SDcols="Value",
  by=list(FromHour=6*(Hour%/%6),
          ToHour=(6*(Hour%/%6))+5)]
setnames(Hourly.6,3:8,summaryNames)
> head(Hourly.6)
   FromHour ToHour    Min      Q1       Med     Mean     Q3   Max
1:        0      5 -3.414 -0.6494  0.037340  0.03586 0.7127 3.421
2:        6     11 -3.189 -0.6488 -0.023940 -0.01856 0.6187 3.272
3:       12     17 -3.467 -0.6743 -0.015590 -0.01424 0.6879 3.446
4:       18     23 -3.845 -0.6761  0.002263  0.01285 0.7258 3.848
# 
Daily.12h <- Dt[,sapply(.SD,function(x){
    as.list(summary(x))
  }),
  .SDcols="Value",
  by=list(Date,
          FromHour=12*(Hour%/%12),
          ToHour=(12*(Hour%/%12))+11)]
setnames(Daily.12h,4:9,summaryNames)
> head(Daily.12h)
         Date FromHour ToHour    Min      Q1     Med     Mean      Q3   Max
1: 2013-11-10        0     11 -1.265 -0.4744  0.0999  0.19420 0.65170 1.715
2: 2013-11-10       12     23 -1.967 -0.8032 -0.3454 -0.21150 0.42500 1.787
3: 2013-11-11        0     11 -1.687 -0.3776  0.5576  0.18420 0.84790 1.254
4: 2013-11-11       12     23 -1.265 -0.5237 -0.3432 -0.08151 0.09205 2.169
5: 2013-11-12        0     11 -1.549 -0.0530  0.1699  0.24280 0.63350 1.516
6: 2013-11-12       12     23 -2.309 -0.6314 -0.1401 -0.13080 0.39680 2.050
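
As an alternative sketch, the summary columns can be named directly in j by returning a named list, which avoids the separate setnames step. This assumes the same kind of table as Dt but uses fixed dates (the name Dt2 is made up here), and it computes the quartiles with quantile, whose default method should match what summary reports:

```r
library(data.table)

set.seed(123)
# Stand-in data: 3 days of hourly values
Dt2 <- data.table(
  Date  = rep(as.Date("2014-01-01") + 0:2, each = 24),
  Hour  = rep(0:23, 3),
  Value = rnorm(24 * 3))

# Each element of the returned list becomes one column of the result
Hourly.6b <- Dt2[, list(
    Min  = min(Value),
    Q1   = quantile(Value, 0.25, names = FALSE),
    Med  = median(Value),
    Mean = mean(Value),
    Q3   = quantile(Value, 0.75, names = FALSE),
    Max  = max(Value)),
  by = list(FromHour = 6 * (Hour %/% 6),
            ToHour   = 6 * (Hour %/% 6) + 5)]
Hourly.6b
```

The values here are returned at full precision rather than rounded for display, so they may differ in the last digits from the summary output shown above.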