R - Forecasting with ARIMA, TBATS, UCM, Bayesian structural time series, etc.

Date: 2016-08-30 21:19:47

Tags: r forecasting

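I am two months into learning forecasting concepts, but I am working hard at it and keep practicing. Here I am trying to forecast weekly product movement by fitting different forecasting techniques on a training dataset and checking their accuracy on a test dataset. I have tried ARIMA, TBATS, Holt-Winters, UCM, Bayesian structural time series and others, but I cannot improve my accuracy, which seems very bad. I don't know where I am going wrong. I also tried ARIMA with regressors, but again it did not help. I am not sure whether my code or my approach is wrong. Can anyone guide me on improving my accuracy? Below is the weekly dataset (starting 8 December 2012):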

  [1]  74  76  78  63  58  58  57  56  85  73  71  91  85
 [14]  79 101  74  86  98 131  90 127 116 320 145 121 148
 [27] 112 141 153 118 151 151 152  90 147 123 266  99 110
 [40] 146 134  76  81 100  80 323  15  22  14  13  19  56
 [53]  78  79  70  79  24  26  31  35  45  33  41  41  61
 [66]  91  83  76  57  68  87  82 105  76 107 116 105 124
 [79] 127 149 124 120 111 122 134  87  80  81  89  40  63
 [92] 112  85 131  97  51  65  74  70  47  62  60  49  47
[105]  56  64  57  58  45  56  60  49  82  49  61  71  61
[118]  92  90  75  69 114  79 144 121 133 132 114 124 152
[131] 125 112 128 124 152  95  64  59  91 132 146 120 196
[144] 212 115 125  66  68  78  83  74 300  46  98  86  95
[157]  61  73  89  56  81  60  58 101 482  55 124  72  57
[170]  51  82  55  68 105 153 113 105  85  34  77  95  96
[183]  97  94  81 104  76  97  65  42  18  11

I use 178 weeks as the training period and 14 weeks as the test period. Say `data` is my data frame, with "units" as the column name:

series    <- ts(data, start=2012+342/365.25, frequency = 365.25/7)
kk        <- 178
seas      <- 365.25/7
st        <- tsp(series)[1] + (1/seas)*(kk-1)
training  <- window(series, end = st)
testing   <- window(series, start = st + 1/52.17857, end = st+14/52.17857)

train1 <- training[,"units"]
test1  <- testing[,"units"]

##ARIMA
farima    <- forecast(auto.arima(train1),h=14)
acc_arima <- accuracy(farima$mean,test1)

##TBATS
fTBATS    <- forecast(tbats(train1,seasonal.periods=c(4,7,12,52)), h=14)
acc_TBATS <- accuracy(fTBATS$mean,test1)

##struTs
fstruTs    <- forecast(StructTS(train1), h=14)
acc_struTs <- accuracy(fstruTs$mean,test1)

##UCM
forUCM     <- ucm(formula = train1~0, data = train1, level = TRUE, slope = TRUE)
fUCM       <- predict(forUCM$model, n.ahead = 14)
acc_UCM    <- accuracy(fUCM$fit,test1)

##Bayesian Structural time series
ss <- AddLocalLinearTrend(list(), train1)
ss <- AddSeasonal(ss, train1, nseasons = 52, season.duration = 7)

model2 <- bsts(train1, state.specification = ss, niter = 500)
fbsts <- predict(model2, horizon = 14, burn = 100)
acc_bsts <- accuracy(fbsts$mean,test1)

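For all of the methods above, my MAPE is above 100%, which I think is very bad. Can someone guide me on improving the accuracy? I would be very grateful. Thanks!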

1 answer:

Answer 0 (score: 1)

I would recommend a few things:

1) If you are using the excellent R forecast package, I would recommend at least trying the fully automated forecast (see examples below).

2) I would recommend plotting the forecast and actual values, along with the historic data to see if the output seems reasonable given the historic data.

3) I would recommend reading the free on-line textbook made by some of the creators of the R forecast package.

The example below uses the fully automated time series forecast from the forecast package and plots the results, both for the data-set you're using above, and another publicly available data-set.

library(ggplot2)
library(forecast)

data <- read.table("./data.txt", quote="\"", comment.char="")
series <- ts(as.numeric(data), start=2012+342/365.25, frequency = 365.25/7)

train_length <- 178
test_length <- length(series) - train_length
train_end <- time(series)[train_length]
test_start <- time(series)[train_length+1]

training <- window(series, end = train_end)
testing <- window(series, start = test_start)

## Use default forecast
fcast <- forecast(training, h=test_length)
plot(fcast)
lines(testing, col='red')
acc_fcast <- accuracy(fcast$mean, testing)

## Second example: the publicly available monthly births data-set
births <- scan("http://robjhyndman.com/tsdldata/data/nybirths.dat")
birthstimeseries <- ts(births, frequency=12, start=c(1946,1))
train_length <- 150
test_length <- length(birthstimeseries) - train_length
train_end <- time(birthstimeseries)[train_length]
test_start <- time(birthstimeseries)[train_length+1]

training <- window(birthstimeseries, end = train_end)
testing <- window(birthstimeseries, start = test_start)

## Use default forecast
fcast <- forecast(training, h=test_length)
plot(fcast)
lines(testing, col='red')
acc_births <- accuracy(fcast$mean, testing)
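
Since the question reports a MAPE above 100%, it may also help to look at the per-week percentage errors rather than only the average. The following is a minimal base R sketch, not the output of any model above: `actual` holds the last 14 values of the series from the question (weeks 179 to 192), while `flat_forecast` is a hypothetical constant forecast of 90 chosen purely for illustration.

```r
# Last 14 observations of the question's series (weeks 179-192), i.e. the test set.
actual <- c(34, 77, 95, 96, 97, 94, 81, 104, 76, 97, 65, 42, 18, 11)

# Hypothetical flat forecast, for illustration only (an assumption, not a fitted model).
flat_forecast <- rep(90, length(actual))

# Absolute percentage error for each week, then its mean (MAPE) and median.
ape        <- 100 * abs(actual - flat_forecast) / abs(actual)
mape       <- mean(ape)
median_ape <- median(ape)

print(round(ape, 1))
print(round(c(MAPE = mape, MedianAPE = median_ape), 1))
```

With this illustrative flat value, the last two weeks (actuals of 18 and 11) alone contribute errors of 400% and roughly 718%, so the mean is pulled far above the error of a typical week. MAPE divides by the actual value, so a few very small observations in the test window can dominate it; comparing it with the MAE and RMSE columns that `accuracy()` also returns may give a fairer picture.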