statsmodels Logit().fit() raises LinAlgError: Singular matrix

Time: 2018-02-03 21:44:37

Tags: python python-3.x statistics statsmodels

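I am trying to run a logit model on a dataset, where mpg_high is the outcome variable derived from other columns of the data frame.

When I run the following code, I do not get any errors: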

import statsmodels.api as sm

# Predictor columns, including the explicit intercept column
exog = ['constant', 'cylinders', 'displacement', 'horsepower', 'weight', 'year', 'origin']

logit = sm.Logit(endog=df_quant2['mpg_high'], exog=df_quant2[exog])

But when I try to fit it with

logit.fit()

I get the following error:

Warning: Maximum number of iterations has been exceeded.
         Current function value: inf
         Iterations: 35


/Users/*/anaconda/lib/python3.6/site-packages/statsmodels/discrete/discrete_model.py:1214: RuntimeWarning: overflow encountered in exp
  return 1/(1+np.exp(-X))
/Users/*/anaconda/lib/python3.6/site-packages/statsmodels/discrete/discrete_model.py:1264: RuntimeWarning: divide by zero encountered in log
  return np.sum(np.log(self.cdf(q*np.dot(X,params))))
---------------------------------------------------------------------------
LinAlgError                               Traceback (most recent call last)
<ipython-input-112-f8dd482d7884> in <module>()
----> 1 logit.fit()

/Users/*/anaconda/lib/python3.6/site-packages/statsmodels/discrete/discrete_model.py in fit(self, start_params, method, maxiter, full_output, disp, callback, **kwargs)
   1375         bnryfit = super(Logit, self).fit(start_params=start_params,
   1376                 method=method, maxiter=maxiter, full_output=full_output,
-> 1377                 disp=disp, callback=callback, **kwargs)
   1378 
   1379         discretefit = LogitResults(self, bnryfit)

/Users/*/anaconda/lib/python3.6/site-packages/statsmodels/discrete/discrete_model.py in fit(self, start_params, method, maxiter, full_output, disp, callback, **kwargs)
    202         mlefit = super(DiscreteModel, self).fit(start_params=start_params,
    203                 method=method, maxiter=maxiter, full_output=full_output,
--> 204                 disp=disp, callback=callback, **kwargs)
    205 
    206         return mlefit # up to subclasses to wrap results

/Users/*/anaconda/lib/python3.6/site-packages/statsmodels/base/model.py in fit(self, start_params, method, maxiter, full_output, disp, fargs, callback, retall, skip_hessian, **kwargs)
    456             Hinv = cov_params_func(self, xopt, retvals)
    457         elif method == 'newton' and full_output:
--> 458             Hinv = np.linalg.inv(-retvals['Hessian']) / nobs
    459         elif not skip_hessian:
    460             H = -1 * self.hessian(xopt)

/Users/*/anaconda/lib/python3.6/site-packages/numpy/linalg/linalg.py in inv(a)
    511     signature = 'D->D' if isComplexType(t) else 'd->d'
    512     extobj = get_linalg_error_extobj(_raise_linalgerror_singular)
--> 513     ainv = _umath_linalg.inv(a, signature=signature, extobj=extobj)
    514     return wrap(ainv.astype(result_t, copy=False))
    515 

/Users/*/anaconda/lib/python3.6/site-packages/numpy/linalg/linalg.py in _raise_linalgerror_singular(err, flag)
     88 
     89 def _raise_linalgerror_singular(err, flag):
---> 90     raise LinAlgError("Singular matrix")
     91 
     92 def _raise_linalgerror_nonposdef(err, flag):

LinAlgError: Singular matrix

All values in mpg_high are either 0 or 1.
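The two RuntimeWarnings above (overflow in exp, divide by zero in log) indicate that the linear predictor became very large in magnitude, so the log-likelihood evaluated to inf and the Hessian passed to np.linalg.inv could not be inverted. A minimal sketch of checks that can be run on the predictors before fitting, using only NumPy and pandas. The helper name `check_design` and the synthetic `demo` frame are hypothetical and not part of the original post:

```python
import numpy as np
import pandas as pd


def check_design(X: pd.DataFrame) -> dict:
    """Summarize properties of a design matrix that can lead to a singular Hessian."""
    # Columns stored as object/string cannot be used directly as predictors.
    non_numeric = [c for c in X.columns if not pd.api.types.is_numeric_dtype(X[c])]
    numeric = X.drop(columns=non_numeric).astype(float)
    # Rank and scale are computed on complete rows only.
    complete = numeric.dropna().to_numpy()
    return {
        "non_numeric_columns": non_numeric,
        "rows_with_nan": int(numeric.isna().any(axis=1).sum()),
        "n_columns": complete.shape[1],
        "rank": int(np.linalg.matrix_rank(complete)),
        "max_abs_value": float(np.abs(complete).max()),
    }


# Synthetic stand-in data (made up for illustration only).
rng = np.random.default_rng(0)
n = 50
demo = pd.DataFrame({
    "constant": 1.0,
    "weight": rng.uniform(1600, 5100, n),
    "year": rng.integers(70, 83, n).astype(float),
})
# A perfectly collinear column, added on purpose to trigger rank deficiency.
demo["weight_doubled"] = demo["weight"] * 2.0

report = check_design(demo)
print(report)
```

A rank below the number of columns points to collinear predictors, and a very large max_abs_value suggests rescaling the columns before calling fit().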

I'm not sure what I'm missing here; any help is appreciated!

Thanks!

1 Answer:

Answer 0 (score: 0)

Update

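I was able to get a model to fit by excluding the horsepower variable from the exog argument. This might be due to its data type. I have converted the column to float64, but the model still fails with the converted column included.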
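If horsepower was read in as a string column, a cast to float64 can leave missing values behind, and NaN in exog propagates into the likelihood. A hedged sketch, assuming the column contains a non-numeric placeholder such as "?" (this is an assumption; the post does not show the raw data). The frame `raw` is a made-up stand-in for df_quant2, and the standardization step is an optional suggestion rather than something the answer used:

```python
import pandas as pd

# Hypothetical stand-in for df_quant2; values are invented for illustration.
raw = pd.DataFrame({
    "mpg_high": [0, 1, 1, 0, 1],
    "horsepower": ["130", "95", "?", "165", "88"],
    "weight": [3504, 2372, 2130, 3693, 2065],
})

clean = raw.copy()
# Non-numeric entries become NaN instead of raising an error.
clean["horsepower"] = pd.to_numeric(clean["horsepower"], errors="coerce")
n_bad = int(clean["horsepower"].isna().sum())
# Drop the rows that could not be parsed.
clean = clean.dropna(subset=["horsepower"]).reset_index(drop=True)

# Standardize each predictor so exp() in the logistic function is less likely to overflow.
predictors = ["horsepower", "weight"]
for col in predictors:
    clean[col] = (clean[col] - clean[col].mean()) / clean[col].std()

# Explicit intercept column, named as in the question.
clean.insert(0, "constant", 1.0)

print(n_bad)
print(clean.dtypes)
```

The cleaned columns can then be passed as exog to sm.Logit in the same way as in the question.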

Running