I get an error when I try to fit a VAR time-series model to some data in Python with statsmodels (see its VAR documentation).
My data is in the DataFrame df_IBM_training, which looks like this:
    date        sym  open    high      low       close     newscount
6   2014.08.05  IBM  189.30  189.3000  186.4100  187.0800  4
9   2014.08.06  IBM  185.80  186.8800  184.4400  185.9000  0
12  2014.08.07  IBM  186.56  186.8800  1.0000    184.2800  2
15  2014.08.08  IBM  183.32  186.6800  183.3200  186.5499  18
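The VAR model I want to build is shown in the equation that follows, and my code tries to create its regressors. The code also searches for the ideal model order, which is where I get the error. Each coefficient in the equation, for example α1,1 and γ1,11, is associated with one regressor in the code: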
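$$
\begin{aligned}
\Delta\log(C_t) ={}& \alpha_{1,1}\,\bigl(\log C_{t-1} - \log O_{t-1}\bigr) \\
&+ \alpha_{1,2}\,\bigl(\log C_{t-1} - \log H_{t-1}\bigr) \\
&+ \alpha_{1,3}\,\bigl(\log C_{t-1} - \log L_{t-1}\bigr) \\
&+ \gamma_{1,11}\,\Delta\log(C_{t-1}) \\
&+ \gamma_{1,12}\,\Delta\log(O_{t-1}) \\
&+ \gamma_{1,13}\,\Delta\log(H_{t-1}) \\
&+ \gamma_{1,14}\,\Delta\log(L_{t-1}) \\
&+ \varepsilon_t
\end{aligned}
$$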
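To make the timing in the equation concrete, here is a minimal sketch of how the left-hand side and the seven lagged terms line up row by row for a single equation. It is only an illustration: the prices are made-up numbers (not the IBM data), the helper `dlog` is a hypothetical name, and it builds a plain design matrix rather than anything statsmodels does internally.

```python
import numpy as np

# Hypothetical daily prices (made-up numbers, not the IBM data).
open_ = np.array([100.0, 101.0, 102.5, 101.5, 103.0])
high = np.array([101.5, 102.0, 103.0, 103.5, 104.0])
low = np.array([99.5, 100.0, 101.0, 101.0, 102.0])
close = np.array([101.0, 101.5, 101.2, 103.0, 103.5])

log_o, log_h, log_l, log_c = (np.log(x) for x in (open_, high, low, close))


def dlog(log_series):
    """First difference of a log series; element i is the value for t = i + 1."""
    return np.diff(log_series)


# Left-hand side: dlog(C_t) for t = 2, 3, 4 (t = 1 is dropped because the
# lagged differences on the right-hand side are not defined for it).
y = dlog(log_c)[1:]

# Right-hand side: every term is evaluated at t - 1 = 1, 2, 3.
X = np.column_stack([
    (log_c - log_o)[1:-1],  # term multiplied by alpha_{1,1}
    (log_c - log_h)[1:-1],  # term multiplied by alpha_{1,2}
    (log_c - log_l)[1:-1],  # term multiplied by alpha_{1,3}
    dlog(log_c)[:-1],       # term multiplied by gamma_{1,11}
    dlog(log_o)[:-1],       # term multiplied by gamma_{1,12}
    dlog(log_h)[:-1],       # term multiplied by gamma_{1,13}
    dlog(log_l)[:-1],       # term multiplied by gamma_{1,14}
])

print(y.shape, X.shape)
```

Each row of `X` holds the values at t-1 that explain the matching entry of `y` at t, which is the alignment the equation describes.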
My code is below. For some reason, I get the following error at the line model.select_order(8):
numpy.linalg.linalg.LinAlgError: 7-th leading minor not positive definite
import numpy as np
import statsmodels.tsa.api

# VAR regressors: daily log returns of close, open, high and low
df_IBM_training['log_ret0'] = np.log(df_IBM_training.close) - np.log(df_IBM_training.close.shift(1))
df_IBM_training['log_ret1'] = np.log(df_IBM_training.open) - np.log(df_IBM_training.open.shift(1))
df_IBM_training['log_ret2'] = np.log(df_IBM_training.high) - np.log(df_IBM_training.high.shift(1))
df_IBM_training['log_ret3'] = np.log(df_IBM_training.low) - np.log(df_IBM_training.low.shift(1))
# Drop rows where the low log return is NaN or infinite (e.g. the first row after shift)
df_IBM_training = df_IBM_training[np.isfinite(df_IBM_training['log_ret3'])]
# Log spreads between close and open, high, low
regressor_1 = np.log(df_IBM_training['close']) - np.log(df_IBM_training['open'])
regressor_2 = np.log(df_IBM_training['close']) - np.log(df_IBM_training['high'])
regressor_3 = np.log(df_IBM_training['close']) - np.log(df_IBM_training['low'])
regressor_4 = df_IBM_training['log_ret0']
regressor_5 = df_IBM_training['log_ret1']
regressor_6 = df_IBM_training['log_ret2']
regressor_7 = df_IBM_training['log_ret3']
# Stack into an array of shape (n_observations, 7), one column per regressor
X_IBM = [regressor_1, regressor_2, regressor_3, regressor_4, regressor_5, regressor_6, regressor_7]
X_IBM = np.array(X_IBM)
X_IBM = X_IBM.T
model = statsmodels.tsa.api.VAR(X_IBM)
# The line below is where the error arises
model.select_order(8)
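One property of these regressors may be worth checking: the change in a log spread such as regressor_1 from one row to the next equals the difference of the two corresponding log returns (regressor_4 minus regressor_5), so the seven series are tied together by exact linear relations across adjacent rows. The sketch below verifies that identity on synthetic prices; the data and variable names are hypothetical, and whether this is what triggers the error on the real data is an assumption, not something the code above confirms.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic positive "prices" (hypothetical; not the IBM data).
close = 100.0 + rng.random(50)
open_ = 100.0 + rng.random(50)

spread = np.log(close) - np.log(open_)  # built like regressor_1
ret_close = np.diff(np.log(close))      # built like regressor_4 (log_ret0)
ret_open = np.diff(np.log(open_))       # built like regressor_5 (log_ret1)

# spread[t] - spread[t-1] == ret_close[t] - ret_open[t] for every t.
identity_holds = np.allclose(np.diff(spread), ret_close - ret_open)
print(identity_holds)
```

The same relation holds for the high and low spreads paired with their own returns.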
Edit: here is the full traceback:
Traceback (most recent call last):
  File "TimeSeries.py", line 70, in <module>
    model.select_order(8)
  File "C:\Python34\lib\site-packages\statsmodels\tsa\vector_ar\var_model.py", line 505, in select_order
    for k, v in iteritems(result.info_criteria):
  File "C:\Python34\lib\site-packages\statsmodels\base\wrapper.py", line 35, in __getattribute__
    obj = getattr(results, attr)
  File "C:\Python34\lib\site-packages\statsmodels\tools\decorators.py", line 94, in __get__
    _cachedval = self.fget(obj)
  File "C:\Python34\lib\site-packages\statsmodels\tsa\vector_ar\var_model.py", line 1468, in info_criteria
    ld = logdet_symm(self.sigma_u_mle)
  File "C:\Python34\lib\site-packages\statsmodels\tools\linalg.py", line 213, in logdet_symm
    c, _ = linalg.cho_factor(m, lower=True)
  File "C:\Python34\lib\site-packages\scipy\linalg\decomp_cholesky.py", line 132, in cho_factor
    check_finite=check_finite)
  File "C:\Python34\lib\site-packages\scipy\linalg\decomp_cholesky.py", line 30, in _cholesky
    raise LinAlgError("%d-th leading minor not positive definite" % info)
numpy.linalg.linalg.LinAlgError: 5-th leading minor not positive definite
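The last frames show a Cholesky factorization of the residual covariance estimate (sigma_u_mle) failing. As a minimal sketch of that failure mode in isolation, the snippet below factorizes a small, exactly singular matrix. The 3x3 matrix is a toy example, np.linalg.cholesky is used as a stand-in for scipy's cho_factor, and the idea that sigma_u_mle is singular in the real run is an assumption that the traceback alone does not prove.

```python
import numpy as np

# Toy symmetric matrix whose third row is row 1 minus row 2, so its rank is 2.
sigma = np.array([[1.0, 0.0, 1.0],
                  [0.0, 1.0, -1.0],
                  [1.0, -1.0, 2.0]])

rank = np.linalg.matrix_rank(sigma)

try:
    np.linalg.cholesky(sigma)
    cholesky_failed = False
except np.linalg.LinAlgError:
    # Raised because the matrix is not positive definite.
    cholesky_failed = True

print("rank:", rank, "of", sigma.shape[0])
print("Cholesky failed:", cholesky_failed)
```

Running np.linalg.matrix_rank on the fitted model's residual covariance would be one way to check whether the same thing is happening with the real data.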