Question

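I am working on a forecasting model with monthly data from 2014 through the current month (March 2018).

Part of my data is a column of billings and a column of quote amounts, e.g. (apologies for the formatting)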

Year  Quarter  Month   Billings  Quotes
2014  2014Q1   201401  100       500
2014  2014Q1   201402  150       600
2014  2014Q1   201403  200       700

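I am using this to forecast monthly sales, and I am trying to use xreg with the number of quotes for each month.

I went through the article below, but I am missing something needed to do what I am trying to do: ARIMA forecasting with auto.Arima() and xreg

Question: Can someone show an example of forecasting OUT OF SAMPLE with xreg? I understand that to do this you need to forecast your xreg variables out of sample, but I can't figure out how to pass in those future values.

After forecasting the values I tried using something like futurevalues$mean, but that did not work.

Here is my code: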

sales = read.csv('sales.csv')

# Below, I'm creating a training set for the models through 
#  December 2017 (48 months).
train = sales[sales$TRX_MON<=201712,]

# I will also create a test set for our data from January 2018 (3 months)
test = sales[sales$TRX_MON>201712,]

dtstr2 <- ts(train, start=2014, frequency=12)
dtste2 <- ts(test, start=2018, frequency=12)

fit2 <- auto.arima(dtstr2[,"BILLINGS"], xreg=dtstr2[,"QUOTES"])
fcast2 <- forecast(fit2, xreg=dtste2[,"QUOTES"], h=24)
fcast2

The code above works, but it only gives a forecast for 3 months, e.g.

                  Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
Jan 2018          70                60       100      50       130
Feb 2018          80                70       110      60       140
Mar 2018          90                80       120      70       150

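I have searched as many blogs and threads as I can for an example of using auto.arima with an out-of-sample forecast of the xreg variable, and I cannot find anything that does this.

Can anyone help?

Thanks very much.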

Answer 1

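Here is a MWE for out-of-sample forecasting of a time series with unknown covariates. It relies on the data provided for this question as well as @Raad's excellent answer.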

library("forecast")

dta = read.csv("~/stackexchange/data/xdata.csv")[1:96,]
dta <- ts(dta, start = 1)

# to illustrate out of sample forecasting with covariates lets split the data
train <- window(dta, end = 90)
test <- window(dta, start = 91)

# fit model
covariates <- c("Customers", "Open", "Promo")
fit <- auto.arima(train[,"Sales"], xreg = train[, covariates])

Forecast from the test data:

fcast <- forecast(fit, xreg = test[, covariates])
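
Note that no h is passed in the call above. As documented for the forecast package, when xreg is supplied the horizon is taken from the number of rows of xreg and h is ignored, which is why the code in the question returns only 3 months despite h=24. A minimal sketch on made-up data (the names x, y, fit_demo, future_x and fc_demo are illustrative only, not from the question):

```r
library(forecast)

set.seed(42)
x <- rnorm(48)                                   # made-up regressor, 48 months
y <- ts(2 * x + rnorm(48), start = c(2014, 1), frequency = 12)

# Fit with a one-column, named regressor matrix
fit_demo <- auto.arima(y, xreg = cbind(x = x))

# Only 3 rows of future regressor values are available
future_x <- cbind(x = rnorm(3))

# h = 24 is requested, but the horizon follows nrow(future_x)
fc_demo <- forecast(fit_demo, xreg = future_x, h = 24)
length(fc_demo$mean)
```

To forecast further ahead, the xreg matrix itself has to contain one row per future period.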

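But what if we don't know the values of Customers yet? The goal is then to forecast Customers and use those forecast values in the forecast of Sales. Open and Promo are under the manager's control, so they will be "fixed" in the forecast.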

customerfit <- auto.arima(train[,"Customers"], xreg = train[, c("Open","Promo")])

I will try forecasting 2 weeks ahead, and assume there is no promotion.

newdata <- data.frame(Open = rep(c(1,1,1,1,1,1,0), times = 2),
                          Promo = 0)

# as.matrix(): newer versions of forecast require xreg to be a numeric matrix
customer_fcast <- forecast(customerfit, xreg = as.matrix(newdata))

# the values of customer are in `customer_fcast$mean`

newdata$Customers <- as.vector(customer_fcast$mean)

It is crucial to get the newdata columns in the same order as in the original data! forecast() matches the regressors by position.

sales_fcast <- forecast(fit, xreg = as.matrix(newdata)[,c(3,1,2)])
plot(sales_fcast)
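
Since the column order matters, one way to make the reordering less fragile is to select the columns by name instead of by index. The helper below is a hypothetical sketch in base R (align_xreg is not part of the forecast package, and the values in the example data frame are made up):

```r
# Hypothetical helper: put future regressors in the same column order as
# the regressors used for fitting, and fail loudly if one is missing.
align_xreg <- function(newdata, training_names) {
  missing_cols <- setdiff(training_names, colnames(newdata))
  if (length(missing_cols) > 0) {
    stop("missing regressors: ", paste(missing_cols, collapse = ", "))
  }
  as.matrix(newdata[, training_names, drop = FALSE])
}

# Made-up future data with Customers appended last, as in the answer
example_newdata <- data.frame(Open = c(1, 0), Promo = c(0, 0),
                              Customers = c(550, 0))
xreg_future <- align_xreg(example_newdata, c("Customers", "Open", "Promo"))
colnames(xreg_future)
```

With the objects from the answer, the equivalent call would be forecast(fit, xreg = align_xreg(newdata, covariates)).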

Created on 2018-03-29 by the reprex package (v0.2.0).

Answer 2

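Thanks again for helping with this.

I was able to get what I wanted by combining the suggestions above.

What I ended up doing was creating time series objects for my exogenous variables and forecasting them. I then took the $mean output of those forecasts, created time series objects from it (for as many periods as I wanted to forecast my original variable), and fed those into my original forecasting model.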
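
The steps described above can be sketched roughly as follows. This is not the poster's actual code: the data are simulated, and the names quotes, billings, h and the *_fit / *_fcast objects are assumptions made for illustration.

```r
library(forecast)

set.seed(1)
n <- 51  # Jan 2014 to Mar 2018; all values below are simulated
quotes   <- ts(500 + 5 * seq_len(n) + rnorm(n, sd = 20),
               start = c(2014, 1), frequency = 12)
billings <- ts(0.25 * as.numeric(quotes) + rnorm(n, sd = 10),
               start = c(2014, 1), frequency = 12)

h <- 12  # number of months to forecast

# Step 1: model the exogenous variable on its own and forecast it
quotes_fit   <- auto.arima(quotes)
quotes_fcast <- forecast(quotes_fit, h = h)

# Step 2: use the point forecasts ($mean) as the future regressor values
future_quotes <- matrix(as.numeric(quotes_fcast$mean), ncol = 1,
                        dimnames = list(NULL, "QUOTES"))

# Step 3: fit the main model on the observed regressor ...
hist_quotes  <- matrix(as.numeric(quotes), ncol = 1,
                       dimnames = list(NULL, "QUOTES"))
billings_fit <- auto.arima(billings, xreg = hist_quotes)

# ... and forecast it with the forecast regressor (horizon = nrow(future_quotes))
billings_fcast <- forecast(billings_fit, xreg = future_quotes)
billings_fcast
```

One caveat with this approach: the prediction intervals of billings_fcast treat the forecast quotes as if they were known, so they do not reflect the uncertainty in the quotes forecast itself.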

Out-of-sample forecasting with auto.arima() and xreg
