I have an array:
Num Col2 Col3 Col4
1 6 1 1
2 60 0 2
3 60 0 1
4 6 0 1
5 60 1 1
Code:
y = df.loc[:,'Col3'] # response
X = df.loc[:,['Col2','Col4']] # predictor
X = sm.add_constant(X) #add constant
est = sm.OLS(y, X) #build regression
est = est.fit() #full model
When execution reaches .fit(), it raises an error:
Traceback (most recent call last):
  File "D:\Users\Anna\workspace\mob1\mobols.py", line 36, in <module>
    est = est.fit() #full model
  File "C:\Python27\lib\site-packages\statsmodels\regression\linear_model.py", line 174, in fit
    self.pinv_wexog, singular_values = pinv_extended(self.wexog)
  File "C:\Python27\lib\site-packages\statsmodels\tools\tools.py", line 392, in pinv_extended
    u, s, vt = np.linalg.svd(X, 0)
  File "C:\Python27\lib\site-packages\numpy\linalg\linalg.py", line 1327, in svd
    u, s, vt = gufunc(a, signature=signature, extobj=extobj)
  File "C:\Python27\lib\site-packages\numpy\linalg\linalg.py", line 99, in _raise_linalgerror_svd_nonconvergence
    raise LinAlgError("SVD did not converge")
numpy.linalg.linalg.LinAlgError: SVD did not converge
What is the problem, and how can I fix it?
Thanks
Answer 0 (score: 2)
It looks like you are using pandas and statsmodels. I ran your snippet and did not get the LinAlgError("SVD did not converge") exception. Here is what I ran:
import numpy as np
import pandas
import statsmodels.api as sm
d = {'col2': [6, 60, 60, 6, 60], 'col3': [1, 0, 0, 0, 1], 'col4': [1, 2, 1, 1, 1]}
df = pandas.DataFrame(data=d, index=np.arange(1, 6))
print(df)
Output:
   col2  col3  col4
1     6     1     1
2    60     0     2
3    60     0     1
4     6     0     1
5    60     1     1
y = df.loc[:, 'col3']
X = df.loc[:, ['col2', 'col4']]
X = sm.add_constant(X)
est = sm.OLS(y, X)
est = est.fit()
print(est.summary())
Output:
                            OLS Regression Results
==============================================================================
Dep. Variable:                   col3   R-squared:                       0.167
Model:                            OLS   Adj. R-squared:                 -0.667
Method:                 Least Squares   F-statistic:                    0.2000
Date:                Sat, 28 Mar 2015   Prob (F-statistic):              0.833
Time:                        16:43:02   Log-Likelihood:                -3.0711
No. Observations:                   5   AIC:                             12.14
Df Residuals:                       2   BIC:                             10.97
Df Model:                           2
==============================================================================
                 coef    std err          t      P>|t|      [95.0% Conf. Int.]
------------------------------------------------------------------------------
const          1.0000      1.003      0.997      0.424        -3.316     5.316
col2       -8.674e-18      0.013  -6.62e-16      1.000        -0.056     0.056
col4          -0.5000      0.866     -0.577      0.622        -4.226     3.226
==============================================================================
Omnibus:                          nan   Durbin-Watson:                   1.500
Prob(Omnibus):                    nan   Jarque-Bera (JB):                0.638
Skew:                          -0.000   Prob(JB):                        0.727
Kurtosis:                       1.250   Cond. No.                         187.
==============================================================================
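So this runs fine, which means the problem is not in this code, at least not with this data. Could it be that df holds a different matrix from the one you think you are passing in?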
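If the DataFrame itself is the suspect, one way to check is to look for cells that are NaN, infinite, or not numeric before fitting. As far as I know, non-finite values in the design matrix are a common trigger for "SVD did not converge", though I cannot confirm that this is what happened in your case. The sketch below is only an illustration: the helper name report_problem_cells and the NaN planted in Col4 are made up for the example and are not from your data.

```python
import numpy as np
import pandas as pd


def report_problem_cells(frame):
    """Count, per column, the values that are NaN, +/-inf, or non-numeric."""
    # Coerce every column to numbers; anything unparseable becomes NaN.
    numeric = frame.apply(pd.to_numeric, errors="coerce")
    # np.isfinite is False for NaN and for +/-inf.
    bad = ~np.isfinite(numeric.astype(float))
    return bad.sum()


# Hypothetical data: the frame from the question with one value knocked out.
df = pd.DataFrame(
    {
        "Col2": [6, 60, 60, 6, 60],
        "Col3": [1, 0, 0, 0, 1],
        "Col4": [1, 2, np.nan, 1, 1],  # assumed missing value, for illustration
    },
    index=np.arange(1, 6),
)

print(df.dtypes)                 # object columns hint at stray strings
print(report_problem_cells(df))  # per-column count of problem cells

# One possible remedy: drop incomplete rows before building y and X.
# Note that dropna() removes NaN only; infinite values need separate handling.
clean = df.dropna()
```

If the counts are all zero and every dtype is numeric, the data is probably not the cause and the installed numpy/statsmodels versions would be the next thing to look at. If rows are missing values, sm.OLS also accepts missing='drop' as an alternative to calling dropna() yourself.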