Time: 2016-02-26 21:03:45

Tags: r for-loop dataframe forecasting

Loop to compute auto.arima() / forecast() results and store them in a data frame

A small sample of my data frame, filled with random data, can be generated with the following:
df <- data.frame(col1 = runif(24, 400, 700),
                  col2 = runif(24, 350, 600),
                  col3 = runif(24, 600, 940),
                  col4 = runif(24, 2000, 2600),
                  col5 = runif(24, 950, 1200))


colnames(df) <- c("NorthHampton to EastHartford", "NorthHampton to Edison", 
                  "NorthHampton to Yonkers", "North Hampton to Brooklyn", "NorthHampton to Rotterdam" )

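I am trying to run a series of ARIMA models in R using auto.arima(), and I am having trouble producing the output in the format I need. A sample of what I have so far follows.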

ts <- ts(df, frequency = 12, start = c(2014, 1), end = c(2015, 12))

model  <- list()
results <- list()


for (i in 1:ncol(ts)) {
  fit <- auto.arima(ts[,i], stepwise = F, approximation = F)
  model <- forecast(fit)$method
  results <- forecast(fit, h = 3)$mean

#   print(forecast(fit)$method)
#   print(forecast(fit, h=3)$mean)

  }

Ideally, I would like my loop to populate a data.frame formatted like this:

Lane                                 Model                          Time    PointEstimate
Northampton to East Hartford    "ARIMA(0,0,0) with non-zero mean"   Jan-16                  
Northampton to East Hartford    "ARIMA(0,0,0) with non-zero mean"   Feb-16                  
Northampton to East Hartford    "ARIMA(0,0,0) with non-zero mean"   Mar-16                  
Northampton to Edison           "ARIMA(0,0,0) with non-zero mean"   Jan-16                  
Northampton to Edison           "ARIMA(0,0,0) with non-zero mean"   Feb-16                  
Northampton to Edison           "ARIMA(0,0,0) with non-zero mean"   Mar-16                  
Northampton to Yonkers          "ARIMA(0,0,0) with non-zero mean"   Jan-16                  

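The values in Lane should match the column names of the original data frame. The values in Model are the result of forecast(fit)$method, and the point estimates should be the result of forecast(fit, h = 3)$mean, with each item repeated in the data frame once per forecast period (3 in this case).

I think my loop performs the calculations I need; I just can't figure out how to store the results and then append the results of the next iteration to them. I appreciate any help.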

2 Answers:

Answer 0 (score: 2)

You could try the following:

library(forecast)
fits <- lapply(1:ncol(ts),  function(i) auto.arima(ts[,i], stepwise = F, approximation = F))
models <- sapply(1:ncol(ts), function(i) forecast(fits[[i]])$method)
results <- lapply(1:ncol(ts), function(i) forecast(fits[[i]], h = 3)$mean)

resultsdf <- data.frame(do.call(rbind, results))
colnames(resultsdf) <- format(as.Date(time(results[[1]])), "%b-%y")
resultsdf$Lane=colnames(df)
resultsdf$Model=models

library(reshape2)
res <- melt(resultsdf, id.vars=4:5, measure.vars=1:3, variable.name = "Time", value.name = "PointEstimate")

                           Lane                           Model variable     value
1  NorthHampton to EastHartford ARIMA(0,0,0) with non-zero mean janv.-16  546.9441
2        NorthHampton to Edison ARIMA(0,0,0) with non-zero mean janv.-16  487.6225
3       NorthHampton to Yonkers ARIMA(0,0,0) with non-zero mean janv.-16  778.9514
4     North Hampton to Brooklyn ARIMA(1,0,0) with non-zero mean janv.-16 2459.3983
5     NorthHampton to Rotterdam ARIMA(1,0,0) with non-zero mean janv.-16 1098.1912
6  NorthHampton to EastHartford ARIMA(0,0,0) with non-zero mean févr.-16  546.9441
7        NorthHampton to Edison ARIMA(0,0,0) with non-zero mean févr.-16  487.6225
8       NorthHampton to Yonkers ARIMA(0,0,0) with non-zero mean févr.-16  778.9514
9     North Hampton to Brooklyn ARIMA(1,0,0) with non-zero mean févr.-16 2416.4848
10    NorthHampton to Rotterdam ARIMA(1,0,0) with non-zero mean févr.-16 1077.3921
11 NorthHampton to EastHartford ARIMA(0,0,0) with non-zero mean  mars-16  546.9441
12       NorthHampton to Edison ARIMA(0,0,0) with non-zero mean  mars-16  487.6225
13      NorthHampton to Yonkers ARIMA(0,0,0) with non-zero mean  mars-16  778.9514
14    North Hampton to Brooklyn ARIMA(1,0,0) with non-zero mean  mars-16 2397.1000
15    NorthHampton to Rotterdam ARIMA(1,0,0) with non-zero mean  mars-16 1085.3332
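
The month labels in the output above are French (janv., févr., mars) because the %b conversion in format() follows the current locale. If English abbreviations such as Jan-16 are wanted regardless of locale, one option is to build the label from base R's month.abb constant. This is a minimal sketch; the dates below are illustrative values standing in for the forecast periods, not taken from the fitted models.

```r
# Illustrative forecast dates (assumption: first day of each forecast month)
d <- as.Date(c("2016-01-01", "2016-02-01", "2016-03-01"))

# month.abb is a built-in vector of English month abbreviations,
# so the result does not depend on the session's locale
labels <- paste0(month.abb[as.integer(format(d, "%m"))], "-", format(d, "%y"))
labels
```

The resulting character vector could be assigned to colnames(resultsdf) in place of the format(..., "%b-%y") call.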

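Answer 1 (score: 2)

Learn how to assemble data.frames and how to inspect the structure (str) of the objects you are working with. This is a relatively simple exercise.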
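
As a rough sketch of what inspecting the structure looks like here, str() can be called on a single forecast object to see which components it carries. The series x and the seed below are illustrative assumptions, not the asker's data.

```r
library(forecast)

set.seed(42)  # illustrative seed (assumption)
x   <- ts(runif(24, 400, 700), frequency = 12, start = c(2014, 1))
fit <- auto.arima(x)
fc  <- forecast(fit, h = 3)

str(fc, max.level = 1)  # list the top-level components of the forecast object
fc$method               # character string naming the selected model
time(fc$mean)           # time index of the three point forecasts
```

The $method and $mean components are the pieces the function in this answer collects for each column.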

library(forecast)
library(data.table)

combine_ts <- function(df, h=3, frequency= 12, start= c(2014,1), end=c(2015,12)) {
  results <- list()
  ts <- ts(df, frequency = frequency, start = start, end = end)

  for (i in seq_len(ncol(ts))) {
    fit <- auto.arima(ts[, i], stepwise = FALSE, approximation = FALSE)
    fc  <- forecast(fit, h = h)  # forecast once and reuse the result below

    # one data.frame per lane, with h rows (one per forecast period)
    results[[i]] <- data.frame(Lane = rep(colnames(ts)[i], h),
                               Model = rep(fc$method, h),
                               Date = format(as.Date(time(fc$mean)), "%b-%y"),
                               PointEstimate = fc$mean)
  }
  return(data.table::rbindlist(results)) 
}

R> combine_ts(df)
                            Lane                           Model   Date PointEstimate
 1: NorthHampton to EastHartford ARIMA(0,0,0) with non-zero mean Jan-16      536.1760
 2: NorthHampton to EastHartford ARIMA(0,0,0) with non-zero mean Feb-16      536.1760
 3: NorthHampton to EastHartford ARIMA(0,0,0) with non-zero mean Mar-16      536.1760
 4:       NorthHampton to Edison ARIMA(1,0,0) with non-zero mean Jan-16      488.9687
 5:       NorthHampton to Edison ARIMA(1,0,0) with non-zero mean Feb-16      498.8986
 6:       NorthHampton to Edison ARIMA(1,0,0) with non-zero mean Mar-16      502.4015
 7:      NorthHampton to Yonkers ARIMA(0,0,0) with non-zero mean Jan-16      764.8654
 8:      NorthHampton to Yonkers ARIMA(0,0,0) with non-zero mean Feb-16      764.8654
 9:      NorthHampton to Yonkers ARIMA(0,0,0) with non-zero mean Mar-16      764.8654
10:    North Hampton to Brooklyn ARIMA(0,0,0) with non-zero mean Jan-16     2304.5727
11:    North Hampton to Brooklyn ARIMA(0,0,0) with non-zero mean Feb-16     2304.5727
12:    North Hampton to Brooklyn ARIMA(0,0,0) with non-zero mean Mar-16     2304.5727
13:    NorthHampton to Rotterdam ARIMA(0,0,0) with non-zero mean Jan-16     1094.5927
14:    NorthHampton to Rotterdam ARIMA(0,0,0) with non-zero mean Feb-16     1094.5927
15:    NorthHampton to Rotterdam ARIMA(0,0,0) with non-zero mean Mar-16     1094.5927
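
If data.table is not available, the same list-then-bind pattern can be written in base R with do.call(rbind, ...). The sketch below isolates that pattern only: arima() and predict() from the stats package stand in for auto.arima() and forecast() so it runs without extra packages, the model order is fixed rather than selected automatically, and the column names a, b and Step are hypothetical.

```r
# Base-R sketch of "store each iteration in a list, bind once at the end".
# Assumption: stats::arima() with a fixed order replaces auto.arima().
set.seed(1)
df <- data.frame(a = runif(24, 400, 700), b = runif(24, 350, 600))
x  <- ts(df, frequency = 12, start = c(2014, 1))
h  <- 3

results <- vector("list", ncol(x))  # one slot per column
for (i in seq_len(ncol(x))) {
  fit  <- arima(x[, i], order = c(0, 0, 0))  # mean-only model, not auto-selected
  pred <- predict(fit, n.ahead = h)$pred     # h point forecasts
  results[[i]] <- data.frame(Lane = colnames(x)[i],
                             Step = seq_len(h),
                             PointEstimate = as.numeric(pred))
}

res <- do.call(rbind, results)  # stack the per-column data frames
res
```

Binding once after the loop avoids growing a data frame on every iteration, which is the step the original loop was missing.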