How to convert a data frame to an hourly time series in R

Time: 2018-06-23 07:05:34

Tags: r

I have the following data frame in R:

   hourly_calls                 total_calls
   2017-12-01 08:00-08:59       39
   2017-12-01 09:00-09:59       29
   2017-12-01 10:00-10:59       57
   2017-12-01 11:00-11:59       90
   2017-12-01 12:00-12:59       23
   2017-12-01 13:00-13:59       45
   2017-12-01 14:00-14:59       54
   2017-12-01 15:00-15:59       39
   2017-12-01 16:00-16:59       29
   2017-12-01 17:00-17:00       27
   2017-12-04 08:00-08:59       49
   2017-12-04 09:00-09:59       69
   2017-12-04 10:00-10:59       27
   2017-12-04 11:00-11:59       60
   2017-12-04 12:00-12:59       23
   2017-12-04 13:00-13:59       85
   2017-12-04 14:00-14:59       14
   2017-12-04 15:00-15:59       39
   2017-12-04 16:00-16:59       59
   2017-12-04 17:00-17:00       67

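This is a data frame of hourly call volume for a call center (9 hours a day, 5 days a week). I want to convert it into an hourly time series so that I can forecast the next few hours.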

This is what I have tried:

 library(forecast)  # provides msts()

 train <- df[1:1152,]
 test <- df[1153:1206,]
 train <- msts(train[['total_calls']], seasonal.periods=c(9))
 test <- msts(test[['total_calls']], seasonal.periods=c(9))

How can I do this in R?

1 answer:

Answer 0: (score: 0)

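The main problem with the data is that the first column, hourly_calls, holds a time range rather than a single time. As a result, it is not automatically converted to a date-time that can be used to build a ts. One option is to keep only the start-time part and build the time series from that.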

library(tidyverse)
library(lubridate)
library(xts)
library(forecast)


# Split the range into start and end times, then parse the start time
data <- df %>%
  extract(hourly_calls, c("StartTm", "EndTm"),
          regex = "(^\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2})-(\\d{2}:\\d{2})") %>%
  mutate(StartTm = ymd_hm(StartTm))

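# Index the series by StartTm only; EndTm is not used
xtsData <- xts(data$total_calls, order.by = data$StartTm)

# Split into training and test sets (row ranges refer to the full data set)
train <- xtsData[1:1152,]
test <- xtsData[1153:1206,]

# Seasonal period of 9: 9 hours per working day
trainTS <- ts(as.numeric(train), frequency = 9)
fit <- tslm(trainTS ~ season + trend)

# Forecast as many steps ahead as there are test observations
forecast(fit, h = nrow(test))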
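To make the `season + trend` model above less of a black box, here is a minimal base-R sketch of the same idea: a linear trend plus one dummy per hour-of-day slot, fitted with `lm()`. The series is made-up toy data (4 days of 9 hourly counts), and the names `calls`, `train_df`, `new_df`, and `mae` are illustrative only. It is a rough approximation of what the regression does, not the forecast package's actual implementation.

```r
# Hypothetical toy series: 4 days x 9 hourly slots (values are simulated)
set.seed(1)
calls <- rpois(36, lambda = 40)
y <- ts(calls, frequency = 9)

# Hold out the last day (9 observations) as a test set
train <- window(y, end = c(3, 9))
test  <- window(y, start = c(4, 1))

# Build the regressors by hand: linear trend plus hour-of-day factor
train_df <- data.frame(
  y      = as.numeric(train),
  trend  = seq_along(train),
  season = factor(as.integer(cycle(train)))
)
fit <- lm(y ~ season + trend, data = train_df)

# For the forecast horizon, continue the trend and repeat the daily cycle
h <- length(test)
new_df <- data.frame(
  trend  = length(train) + seq_len(h),
  season = factor(as.integer(cycle(test)), levels = levels(train_df$season))
)
pred <- predict(fit, newdata = new_df)

# Mean absolute error on the held-out day
mae <- mean(abs(as.numeric(test) - pred))
```

Comparing `pred` with the held-out values, as `mae` does here, is one simple way to check whether the chosen seasonal period is a reasonable fit for the data.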

Data:

df <- read.table(text =
"hourly_calls                 total_calls
'2017-12-01 08:00-08:59'       39
'2017-12-01 09:00-09:59'       29
'2017-12-01 10:00-10:59'       57
'2017-12-01 11:00-11:59'       90
'2017-12-01 12:00-12:59'       23
'2017-12-01 13:00-13:59'       45
'2017-12-01 14:00-14:59'       54
'2017-12-01 15:00-15:59'       39
'2017-12-01 16:00-16:59'       29
'2017-12-01 17:00-17:00'       27
'2017-12-04 08:00-08:59'       49
'2017-12-04 09:00-09:59'       69
'2017-12-04 10:00-10:59'       27
'2017-12-04 11:00-11:59'       60
'2017-12-04 12:00-12:59'       23
'2017-12-04 13:00-13:59'       85
'2017-12-04 14:00-14:59'       14
'2017-12-04 15:00-15:59'       39
'2017-12-04 16:00-16:59'       59
'2017-12-04 17:00-17:00'       67",
header = TRUE, stringsAsFactors = FALSE)
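
The start-time extraction described in the answer can also be sketched in base R, without the tidyverse. The snippet below uses two strings in the same layout as `hourly_calls`; the variable names and the choice of UTC as the time zone are assumptions made for illustration.

```r
# Two values in the same "YYYY-MM-DD HH:MM-HH:MM" layout as the question's data
hourly_calls <- c("2017-12-01 08:00-08:59", "2017-12-01 09:00-09:59")

# Drop the trailing "-HH:MM" end time, keeping "YYYY-MM-DD HH:MM"
start_chr <- sub("-[0-9]{2}:[0-9]{2}$", "", hourly_calls)

# Parse as date-time; the time zone is an assumption (UTC here)
start_tm <- as.POSIXct(start_chr, format = "%Y-%m-%d %H:%M", tz = "UTC")

# Gap between consecutive start times, in hours
gap_hours <- as.numeric(difftime(start_tm[2], start_tm[1], units = "hours"))
```

Checking the gaps between consecutive start times is a quick way to spot missing hours before choosing a seasonal frequency.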