I have the following data frame in R:
hourly_calls total_calls
2017-12-01 08:00-08:59 39
2017-12-01 09:00-09:59 29
2017-12-01 10:00-10:59 57
2017-12-01 11:00-11:59 90
2017-12-01 12:00-12:59 23
2017-12-01 13:00-13:59 45
2017-12-01 14:00-14:59 54
2017-12-01 15:00-15:59 39
2017-12-01 16:00-16:59 29
2017-12-01 17:00-17:00 27
2017-12-04 08:00-08:59 49
2017-12-04 09:00-09:59 69
2017-12-04 10:00-10:59 27
2017-12-04 11:00-11:59 60
2017-12-04 12:00-12:59 23
2017-12-04 13:00-13:59 85
2017-12-04 14:00-14:59 14
2017-12-04 15:00-15:59 39
2017-12-04 16:00-16:59 59
2017-12-04 17:00-17:00 67
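This is a data frame of hourly call volumes for a call center (9-hour shifts, 5 days a week). I want to convert this data frame to an hourly time series so that I can forecast the next few hours.
This is how I am doing it: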
library(forecast)  # provides msts()
train <- df[1:1152,]
test <- df[1153:1206,]
train <- msts(train[['total_calls']], seasonal.periods=c(9))
test <- msts(test[['total_calls']], seasonal.periods=c(9))
How can I do this in R?
Answer 0 (score: 0)
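The main problem with the data is that the first column, hourly_calls, represents a time range rather than a single point in time. Because of that, it is not automatically converted to a date-time that can be used to build a ts object. One option is to consider only the start time of each range and build the time series from that.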
library(tidyverse)
library(lubridate)
library(xts)
library(forecast)
# Split the range into its start and end parts, then parse the start as a date-time
data <- df %>%
  extract(hourly_calls, c("StartTm", "EndTm"),
          regex = "(^\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2})-(\\d{2}:\\d{2})") %>%
  mutate(StartTm = ymd_hm(StartTm))
# Only StartTm is used to order the series
xtsData <- xts(data$total_calls, order.by = data$StartTm)
# Row indices assume the full data set (at least 1206 rows), not the sample shown here
train <- xtsData[1:1152, ]
test <- xtsData[1153:1206, ]
trainTS <- ts(as.numeric(train), frequency = 9)  # 9 hours a day
fit <- tslm(trainTS ~ season + trend)
# The model uses only trend and season, so the forecast horizon is all it needs
forecast(fit, h = nrow(test))
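Before fixing the seasonal frequency, it may help to check how many intervals each day actually contains. The following is a minimal base R sketch, not part of the original answer, that parses the start time without extra packages and counts rows per day. The names start_chr, start_tm and rows_per_day are made up for illustration. On the 20 sample rows it finds 10 intervals per day (08:00 through 17:00), so the value of 9 used above is worth verifying against the full data.

```r
# Sample rows copied from the question; the real data set is much longer
df <- read.table(text =
"hourly_calls total_calls
'2017-12-01 08:00-08:59' 39
'2017-12-01 09:00-09:59' 29
'2017-12-01 10:00-10:59' 57
'2017-12-01 11:00-11:59' 90
'2017-12-01 12:00-12:59' 23
'2017-12-01 13:00-13:59' 45
'2017-12-01 14:00-14:59' 54
'2017-12-01 15:00-15:59' 39
'2017-12-01 16:00-16:59' 29
'2017-12-01 17:00-17:00' 27
'2017-12-04 08:00-08:59' 49
'2017-12-04 09:00-09:59' 69
'2017-12-04 10:00-10:59' 27
'2017-12-04 11:00-11:59' 60
'2017-12-04 12:00-12:59' 23
'2017-12-04 13:00-13:59' 85
'2017-12-04 14:00-14:59' 14
'2017-12-04 15:00-15:59' 39
'2017-12-04 16:00-16:59' 59
'2017-12-04 17:00-17:00' 67",
header = TRUE, stringsAsFactors = FALSE)

# Drop the trailing "-HH:MM" so only "YYYY-MM-DD HH:MM" remains
start_chr <- sub("-[0-9]{2}:[0-9]{2}$", "", df$hourly_calls)
start_tm <- as.POSIXct(start_chr, format = "%Y-%m-%d %H:%M", tz = "UTC")

# An NA here would mean a row did not match the expected pattern
stopifnot(!anyNA(start_tm))

# Observations per calendar day: a candidate value for the ts frequency
rows_per_day <- table(as.Date(start_tm))
print(rows_per_day)
```

If the counts differ between days in the full data, a fixed frequency will misalign the seasonal pattern, and the gaps would need to be filled or the short days handled first.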
Data:
df <- read.table(text =
"hourly_calls total_calls
'2017-12-01 08:00-08:59' 39
'2017-12-01 09:00-09:59' 29
'2017-12-01 10:00-10:59' 57
'2017-12-01 11:00-11:59' 90
'2017-12-01 12:00-12:59' 23
'2017-12-01 13:00-13:59' 45
'2017-12-01 14:00-14:59' 54
'2017-12-01 15:00-15:59' 39
'2017-12-01 16:00-16:59' 29
'2017-12-01 17:00-17:00' 27
'2017-12-04 08:00-08:59' 49
'2017-12-04 09:00-09:59' 69
'2017-12-04 10:00-10:59' 27
'2017-12-04 11:00-11:59' 60
'2017-12-04 12:00-12:59' 23
'2017-12-04 13:00-13:59' 85
'2017-12-04 14:00-14:59' 14
'2017-12-04 15:00-15:59' 39
'2017-12-04 16:00-16:59' 59
'2017-12-04 17:00-17:00' 67",
header = TRUE, stringsAsFactors = FALSE)
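To see what a trend-plus-season regression does with this sample, here is a rough base R sketch using lm() on the 20 total_calls values. It is an illustration under assumptions, not the answer's method: it treats the fit as a linear trend plus one factor level per hourly slot, which is my understanding of what tslm sets up, and it uses per_day = 10 because that is what the sample contains. The names per_day, future and pred are hypothetical.

```r
# total_calls from the 20 sample rows (two days of 10 hourly intervals each)
calls <- c(39, 29, 57, 90, 23, 45, 54, 39, 29, 27,
           49, 69, 27, 60, 23, 85, 14, 39, 59, 67)

per_day <- 10                       # assumption: intervals per day in the sample
y <- ts(calls, frequency = per_day)

# Regressors: a linear trend plus one factor level per hour-of-day slot
train <- data.frame(
  y      = as.numeric(y),
  trend  = seq_along(y),
  season = factor(as.integer(cycle(y)), levels = 1:per_day)
)
fit <- lm(y ~ trend + season, data = train)

# Forecast the next h intervals by extending the trend and cycling the slot
h <- 3
future <- data.frame(
  trend  = length(y) + seq_len(h),
  season = factor(((length(y) + seq_len(h) - 1) %% per_day) + 1,
                  levels = 1:per_day)
)
pred <- predict(fit, newdata = future)
print(round(pred, 1))
```

With only two days of data each hourly slot is estimated from two observations, so these numbers are only a mechanical check that the pipeline runs; a usable forecast needs the full series.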