如何将不同日子的时间装入时间箱

时间:2015-07-02 19:43:34

标签: r

我有不同日期的时间和数量的数据框:

t= as.POSIXct(c("2015-06-15 18:22:57" , "2015-06-15 18:22:57" ,"2015-06-16 13:10:57"  ,"2015-06-17 02:32:57","2015-06-01 20:59:13","2015-06-06 06:54:13","2015-06-15 01:01:13","2015-06-29 08:01:13"),tz="GMT")   
t
q= c(100,100,100,100,100,100,100,100)
data.frame(time= t, q= q)
                time   q
1 2015-06-15 18:22:57 100
2 2015-06-15 18:22:57 100
3 2015-06-16 13:10:57 100
4 2015-06-17 02:32:57 100
5 2015-06-01 20:59:13 100
6 2015-06-06 06:54:13 100
7 2015-06-15 01:01:13 100
8 2015-06-29 08:01:13 100

我想将这些数据绑定为15分钟增量,如:

Bin          q
00:15:00
00:30:00
00:45:00
01:00:00
01:15:00     100  # this is the quantity at  2015-06-15 01:01:13
01:30:00
01:45:00
02:00:00
02:15:00
02:30:00
02:45:00     100  # this is the quantity at 2015-06-17 02:32:57
03:00:00
03:15:00
03:30:00
03:45:00
04:00:00
04:15:00
04:30:00
04:45:00
05:00:00
05:15:00
05:30:00
05:45:00
06:00:00
06:15:00
06:30:00
06:45:00
07:00:00

.....
.....
.....
    18:15:00
    18:30:00       200  #this is the sum of 2 quantities at 2015-06-15 18:22:57
    18:45:00
    19:00:00

因此,如果数量落入同一时间段,则会添加它们。知道怎么做吗?

谢谢。

2 个答案:

答案 0 :(得分:2)

您可以将cutseq与时间对象一起使用。这里,垃圾箱的开始时间标有。

dat <- data.frame(time= t, q= q)

## Bin data by 15 minute intervals
min_time <- min(t) - difftime(min(t), as.Date(min(t))) # start from beginning of day
breaks <- seq.POSIXt(min_time, max(t)+15*60, by="15 min")
dat$bins <- cut(t, breaks=breaks)

## Summarise by bins
library(dplyr)
dat %>% group_by(bins) %>%
  summarise(sums = sum(q, na.rm=T))

#                  bins sums
# 1 2015-06-01 20:45:00  100
# 2 2015-06-06 06:45:00  100
# 3 2015-06-15 01:00:00  100
# 4 2015-06-15 18:15:00  200
# 5 2015-06-16 13:00:00  100
# 6 2015-06-17 02:30:00  100
# 7 2015-06-29 08:00:00  100

答案 1 :(得分:0)

如果您想按日期划分,无论日期如何,那么可能更容易提取时间并使用它。

dat = data.frame(time=t, q=q)

library(lubridate)
library(plyr)

# Extract time of day from each date-time
dat$hour = hour(dat$time) + minute(dat$time)/60 + second(dat$time)/3600

# Create bin labels
bins=c(paste0(rep(c(paste0(0,0:9),10:23), each=4),":", c("00",15,30,45))[-1],"24:00")

# Bin the data
dat$bins = cut(dat$hour, breaks=seq(0, 24, 0.25), labels=bins)

这是按时间段汇总的结果:

ddply(dat, .(bins), summarise, q_sum = sum(q), .drop=FALSE)

    bins q_sum
1  00:15     0
2  00:30     0
3  00:45     0
4  01:00     0
5  01:15   100
6  01:30     0
...
10 02:30     0
11 02:45   100
12 03:00     0
...
27 06:45     0
28 07:00   100
29 07:15     0
30 07:30     0
31 07:45     0
32 08:00     0
33 08:15   100
34 08:30     0
...
52 13:00     0
53 13:15   100
54 13:30     0
55 13:45     0
...
72 18:00     0
73 18:15     0
74 18:30   200
75 18:45     0
...
82 20:30     0
83 20:45     0
84 21:00   100
85 21:15     0
86 21:30     0
...
95 23:45     0
96 24:00     0