Why does OLS raise LinAlgError: SVD did not converge?

Time: 2015-03-27 18:24:43

Tags: python

I have an array:

Num Col2 Col3  Col4  
1   6     1     1   
2   60    0     2   
3   60    0     1   
4   6     0     1   
5   60    1     1   

Code:

import statsmodels.api as sm

y = df.loc[:, 'Col3']            # response
X = df.loc[:, ['Col2', 'Col4']]  # predictors
X = sm.add_constant(X)           # add an intercept column
est = sm.OLS(y, X)               # build the regression model
est = est.fit()                  # fit the full model

When it reaches .fit(), it raises an error:

Traceback (most recent call last):
File "D:\Users\Anna\workspace\mob1\mobols.py", line 36, in <module>
est = est.fit() #full model
File "C:\Python27\lib\site-packages\statsmodels\regression\linear_model.py", line 174, in fit
self.pinv_wexog, singular_values = pinv_extended(self.wexog)
File "C:\Python27\lib\site-packages\statsmodels\tools\tools.py", line 392, in pinv_extended
u, s, vt = np.linalg.svd(X, 0)
File "C:\Python27\lib\site-packages\numpy\linalg\linalg.py", line 1327, in svd
u, s, vt = gufunc(a, signature=signature, extobj=extobj)
File "C:\Python27\lib\site-packages\numpy\linalg\linalg.py", line 99, in _raise_linalgerror_svd_nonconvergence
raise LinAlgError("SVD did not converge")
numpy.linalg.linalg.LinAlgError: SVD did not converge

What is wrong? How can I fix it?

Thanks

1 Answer:

Answer 0: (score: 2)

It looks like you are using pandas and statsmodels. I ran your snippet and did not get the LinAlgError("SVD did not converge") exception. Here is what I ran:

import numpy as np
import pandas
import statsmodels.api as sm
d = {'col2': [6, 60, 60, 6, 60], 'col3': [1, 0, 0, 0, 1], 'col4': [1, 2, 1, 1, 1]}
df = pandas.DataFrame(data=d, index=np.arange(1, 6))
print(df)

Output:

   col2  col3  col4
1     6     1     1
2    60     0     2
3    60     0     1
4     6     0     1
5    60     1     1

y = df.loc[:, 'col3']
X = df.loc[:, ['col2', 'col4']]
X = sm.add_constant(X)
est = sm.OLS(y, X)
est = est.fit()
print(est.summary())

Output:

                            OLS Regression Results                            
==============================================================================
Dep. Variable:                   col3   R-squared:                       0.167
Model:                            OLS   Adj. R-squared:                 -0.667
Method:                 Least Squares   F-statistic:                    0.2000
Date:                Sat, 28 Mar 2015   Prob (F-statistic):              0.833
Time:                        16:43:02   Log-Likelihood:                -3.0711
No. Observations:                   5   AIC:                             12.14
Df Residuals:                       2   BIC:                             10.97
Df Model:                           2                                         
==============================================================================
                 coef    std err          t      P>|t|      [95.0% Conf. Int.]
------------------------------------------------------------------------------
const          1.0000      1.003      0.997      0.424        -3.316     5.316
col2       -8.674e-18      0.013  -6.62e-16      1.000        -0.056     0.056
col4          -0.5000      0.866     -0.577      0.622        -4.226     3.226
==============================================================================
Omnibus:                          nan   Durbin-Watson:                   1.500
Prob(Omnibus):                    nan   Jarque-Bera (JB):                0.638
Skew:                          -0.000   Prob(JB):                        0.727
Kurtosis:                       1.250   Cond. No.                         187.
==============================================================================

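So this works, and there is nothing wrong, at least with this code. Is it possible that the df you are passing in holds different data from the table you posted?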
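
One thing worth checking, though it is only a guess because the question does not show how df was loaded: np.linalg.svd commonly fails with "SVD did not converge" when its input contains NaN or infinite values. The sketch below looks for such rows before fitting. The helper name find_non_finite and the injected missing value are hypothetical and used only for illustration; they are not part of the original data.

```python
import numpy as np
import pandas as pd


def find_non_finite(df):
    """Return the rows of df that hold NaN or +/-inf in any numeric column."""
    numeric = df.select_dtypes(include=[np.number])
    bad_mask = ~np.isfinite(numeric.to_numpy(dtype=float))
    return df[bad_mask.any(axis=1)]


# Hypothetical data: the table from the question with one value
# replaced by NaN to imitate a badly parsed cell.
df = pd.DataFrame(
    {
        "Col2": [6, 60, 60, 6, 60],
        "Col3": [1, 0, 0, 0, 1],
        "Col4": [1, 2, np.nan, 1, 1],
    },
    index=np.arange(1, 6),
)

bad_rows = find_non_finite(df)
print(bad_rows)  # rows that would poison the SVD

# Drop the offending rows before building y and X.
clean = df.drop(bad_rows.index)
```

If this turns up any rows, fit the model on clean instead of df. Passing missing='drop' to sm.OLS should also skip rows that contain NaN.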