I am comparing SARIMAX fitting results between the R (3.3.1) forecast package (7.3) and Python's (3.5.2) statsmodels (0.8).
The R code is:
library(forecast)
data("AirPassengers")
Arima(AirPassengers, order=c(2,1,1), seasonal=list(order=c(0,1,0),
period=12))$aic
[1] 1017.848
The Python code is:
from statsmodels.tsa.statespace import sarimax
import pandas as pd
AirlinePassengers = pd.Series([112,118,132,129,121,135,148,148,136,119,104,118,115,126,
141,135,125,149,170,170,158,133,114,140,145,150,178,163,
172,178,199,199,184,162,146,166,171,180,193,181,183,218,
230,242,209,191,172,194,196,196,236,235,229,243,264,272,
237,211,180,201,204,188,235,227,234,264,302,293,259,229,
203,229,242,233,267,269,270,315,364,347,312,274,237,278,
284,277,317,313,318,374,413,405,355,306,271,306,315,301,
356,348,355,422,465,467,404,347,305,336,340,318,362,348,
363,435,491,505,404,359,310,337,360,342,406,396,420,472,
548,559,463,407,362,405,417,391,419,461,472,535,622,606,
508,461,390,432])
AirlinePassengers.index = pd.date_range(end='1960-12-31',
periods=len(AirlinePassengers), freq='M')
print(sarimax.SARIMAX(AirlinePassengers,order=(2,1,1),
seasonal_order=(0,1,0,12)).fit().aic)
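This throws an error: ValueError: Non-stationary starting autoregressive parameters found with `enforce_stationarity` set to True.
If I set enforce_stationarity to False (and enforce_invertibility as well, which is also required), the model fit works but the AIC is very poor (> 1400).
With some other model parameters on the same data, e.g. ARIMA(0,1,1)(0,0,1)[12], I get the same results from R and Python, with the stationarity and invertibility checks enabled in Python.
My main question is: what explains the difference in behavior for some model parameters? Are statsmodels' invertibility checks different from those in forecast's Arima, and is one way or the other "more correct"?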
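For reference, here is a minimal sketch of what a stationarity check on a set of AR coefficients amounts to. It uses the step-down (inverse Levinson-Durbin) recursion and is not the implementation used by statsmodels or forecast; the function name `is_stationary` is made up for illustration. It assumes the convention x_t = phi_1 x_{t-1} + ... + phi_p x_{t-p} + e_t. Under the additional assumption that the MA polynomial is written as 1 + theta_1 L + ..., the same function could be applied to the negated MA coefficients to test invertibility.

```python
def is_stationary(ar_coeffs):
    """Return True if x_t = sum(phi_j * x_{t-j}) + e_t is stationary.

    Hypothetical helper for illustration only; not taken from any library.
    """
    phi = [float(c) for c in ar_coeffs]
    # Step-down recursion: the last coefficient of an AR(k) model is its
    # lag-k partial autocorrelation. The process is stationary exactly when
    # every partial autocorrelation lies strictly inside (-1, 1).
    while phi:
        r = phi[-1]
        if abs(r) >= 1.0:
            return False
        k = len(phi)
        # Recover the coefficients of the implied AR(k-1) model.
        phi = [(phi[j] + r * phi[k - 2 - j]) / (1.0 - r * r)
               for j in range(k - 1)]
    return True


# Example AR(2) coefficient pairs (made-up values, not fitted estimates).
print(is_stationary([0.5, 0.3]))
print(is_stationary([0.6, 0.5]))
```

The first pair satisfies the usual AR(2) conditions (phi_1 + phi_2 < 1, phi_2 - phi_1 < 1, |phi_2| < 1) and the second does not, so the two calls should disagree.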
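I also found a pull request related to fixing a bug in the invertibility calculation in statsmodels: https://github.com/statsmodels/statsmodels/pull/3506
After reinstalling statsmodels from the latest source on GitHub, I still get the same error with the code above, but with enforce_stationarity=False and enforce_invertibility=False I get an AIC of about 1010, which is lower than R's. The model parameters are also very different, though.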
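To put the two AIC figures on a common footing, it can help to convert them back to log-likelihoods. The sketch below assumes the standard definition AIC = 2k - 2 log L and assumes that both packages count k = 4 estimated parameters for this model (two AR terms, one MA term and the innovation variance). The helper name `loglik_from_aic` is made up for illustration.

```python
def loglik_from_aic(aic, n_params):
    """Invert AIC = 2*k - 2*loglik to recover the log-likelihood."""
    return (2 * n_params - aic) / 2.0


# AIC reported by R's Arima above; k = 4 is an assumption (see lead-in).
r_loglik = loglik_from_aic(1017.848, 4)
print(r_loglik)
```

If the parameter counts really do match, any remaining gap between the two AICs comes from the log-likelihoods themselves, which is consistent with the two fits ending up at different parameter values.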