Arima and HoltWinters for all variables in a dataset

Posted: 2017-11-03 08:32:41

Tags: r arima

My dataset has many variables, and I want to run a forecast for each of them. Here is part of the dataset:

Market_82  Market_83  Market_84  Market_85  Total   YEAR_  MONTH_  DATE_
14481      7000       5649       6818       536413  1999   1       JAN 1999
15162      7272       5750       6943       558797  1999   2       FEB 1999
15961      7668       5901       7130       582077  1999   3       MAR 1999
16933      7869       5944       7333       605332  1999   4       APR 1999
17758      8057       6009       7637       630019  1999   5       MAY 1999
18266      8428       6177       7930       654694  1999   6       JUN 1999
19058      8587       6313       8145       678877  1999   7       JUL 1999
19881      8823       6430       8270       702958  1999   8       AUG 1999
20996      8922       6718       8363       727667  1999   9       SEP 1999
21851      9178       6908       8596       752467  1999   10      OCT 1999
22681      9306       7011       8777       776867  1999   11      NOV 1999
23769      9439       7264       8914       801741  1999   12      DEC 1999


library(forecast)  # provides forecast()
model <- arima(dataset, order = c(1, 1, 1))
fcast <- forecast(model, h = 2)

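I think I need to write a loop to run this analysis for every variable, but I am a beginner and don't know how to write the loop correctly.

Can anyone help?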

1 Answer:

Answer 0: (score: 1)

The best approach is to write a function and apply it to all the relevant columns, i.e.

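library(forecast)  # provides forecast()

# Fit an ARIMA(1,1,1) model to one series and forecast two steps ahead
my_forecast <- function(x){
  model <- arima(x, order = c(1, 1, 1))
  fcast <- forecast(model, h = 2)
  return(fcast)
}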

# apply it to the first three columns as follows

lapply(d2[1:3], my_forecast)

which gives,

$Market_82
   Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
13       9606.480 9477.761 9735.200 9409.621  9803.34
14       9772.448 9562.007 9982.888 9450.606 10094.29

$Market_83
   Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
13       7491.812 7370.151 7613.472 7305.749 7677.875
14       7602.992 7300.631 7905.354 7140.570 8065.415

$Market_84
   Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
13       9032.648 8942.724 9122.571 8895.122 9170.174
14       9137.515 8931.943 9343.087 8823.120 9451.910
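
Since the question title also mentions HoltWinters, the same function-plus-lapply pattern can be sketched with `stats::HoltWinters`. This is only a sketch and not part of the approach above: the helper name `my_hw_forecast` is hypothetical, and `gamma = FALSE` (no seasonal component) is an assumption made because twelve monthly observations cover just one seasonal cycle. Two columns of `d2` are copied in so the block runs on its own.

```r
# Two columns copied from the dput(d2) output in this answer
d2_sub <- data.frame(
  Market_82 = c(7000, 7272, 7668, 7869, 8057, 8428, 8587, 8823, 8922, 9178, 9306, 9439),
  Market_83 = c(5649, 5750, 5901, 5944, 6009, 6177, 6313, 6430, 6718, 6908, 7011, 7264)
)

# Hypothetical helper: Holt's linear-trend smoothing, h steps ahead
my_hw_forecast <- function(x, h = 2) {
  x_ts <- ts(x, start = c(1999, 1), frequency = 12)  # monthly series starting JAN 1999
  fit <- HoltWinters(x_ts, gamma = FALSE)            # assumption: no seasonal term
  predict(fit, n.ahead = h)                          # point forecasts only
}

hw_fcasts <- lapply(d2_sub, my_hw_forecast)
hw_fcasts
```

With several years of data, dropping `gamma = FALSE` would let `HoltWinters` estimate a seasonal component as well.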

Data

dput(d2)
structure(list(Market_82 = c(7000L, 7272L, 7668L, 7869L, 8057L, 
8428L, 8587L, 8823L, 8922L, 9178L, 9306L, 9439L), Market_83 = c(5649L, 
5750L, 5901L, 5944L, 6009L, 6177L, 6313L, 6430L, 6718L, 6908L, 
7011L, 7264L), Market_84 = c(6818L, 6943L, 7130L, 7333L, 7637L, 
7930L, 8145L, 8270L, 8363L, 8596L, 8777L, 8914L), Market_85 = c(536413L, 
558797L, 582077L, 605332L, 630019L, 654694L, 678877L, 702958L, 
727667L, 752467L, 776867L, 801741L), Total = c(1999L, 1999L, 
1999L, 1999L, 1999L, 1999L, 1999L, 1999L, 1999L, 1999L, 1999L, 
1999L), YEAR_ = 1:12, MONTH_ = structure(c(5L, 4L, 8L, 1L, 9L, 
7L, 6L, 2L, 12L, 11L, 10L, 3L), .Label = c("APR", "AUG", "DEC", 
"FEB", "JAN", "JUL", "JUN", "MAR", "MAY", "NOV", "OCT", "SEP"
), class = "factor"), DATE_ = c(1999L, 1999L, 1999L, 1999L, 1999L, 
1999L, 1999L, 1999L, 1999L, 1999L, 1999L, 1999L)), .Names = c("Market_82", 
"Market_83", "Market_84", "Market_85", "Total", "YEAR_", "MONTH_", 
"DATE_"), class = "data.frame", row.names = c("14481", "15162", 
"15961", "16933", "17758", "18266", "19058", "19881", "20996", 
"21851", "22681", "23769"))
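
Comparing this `dput` with the table in the question, the columns appear to be shifted by one: the row names hold the question's Market_82 values, `d2$Market_82` holds what the question labels Market_83, and so on, so `d2$Market_85` is the question's Total. A likely cause is that DATE_ values such as `JAN 1999` contain a space, so each data row has one more field than the header and `read.table` uses the first column as row names. Below is a minimal sketch of one way to read the table without the shift, assuming whitespace-separated text. The split column names `DATE_MON` and `DATE_YEAR` are made up for illustration, and only three rows are included.

```r
txt <- "Market_82 Market_83 Market_84 Market_85 Total YEAR_ MONTH_ DATE_
14481 7000 5649 6818 536413 1999 1 JAN 1999
15162 7272 5750 6943 558797 1999 2 FEB 1999
15961 7668 5901 7130 582077 1999 3 MAR 1999"

# Skip the header line and supply nine column names, one per field in each row
df <- read.table(
  text = txt, skip = 1, header = FALSE,
  col.names = c("Market_82", "Market_83", "Market_84", "Market_85",
                "Total", "YEAR_", "MONTH_", "DATE_MON", "DATE_YEAR"),
  stringsAsFactors = FALSE
)

df$Market_82  # now matches the first column of the question's table
```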

Note that I left out Market_85, because its autoregressive coefficient seems to be non-stationary.
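
When applying the function to every column, a single failing fit would otherwise stop the whole `lapply` call. One possible way around that is to wrap the fit in `tryCatch`. This is a sketch under assumptions: `safe_forecast` is a hypothetical name, and it uses base R's `predict` instead of `forecast::forecast` so that it runs without extra packages. Whether a given column actually fails depends on the data and the R version, so no particular outcome is claimed here.

```r
# Two columns copied from the dput(d2) output in this answer
d2_sub <- data.frame(
  Market_84 = c(6818, 6943, 7130, 7333, 7637, 7930, 8145, 8270, 8363, 8596, 8777, 8914),
  Market_85 = c(536413, 558797, 582077, 605332, 630019, 654694, 678877, 702958,
                727667, 752467, 776867, 801741)
)

# Hypothetical wrapper: return NULL instead of stopping when arima() fails
safe_forecast <- function(x, h = 2) {
  tryCatch({
    model <- arima(x, order = c(1, 1, 1))
    predict(model, n.ahead = h)$pred  # point forecasts from base R
  }, error = function(e) {
    message("arima() failed: ", conditionMessage(e))
    NULL
  })
}

results <- lapply(d2_sub, safe_forecast)
ok <- Filter(Negate(is.null), results)  # keep only the columns that were fitted
names(ok)
```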