When I run a linear regression with sklearn like this:
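from sklearn import linear_model

# Predictors: columns 3 through 27 (25 columns); target: column 0.
# .values replaces as_matrix(), which newer pandas versions no longer provide.
x = df3.iloc[:, 3:28].values.astype(int)
y = df3.iloc[:, 0].values.astype(int)

clf = linear_model.LinearRegression()
clf.fit(x, y)
print(clf.coef_)        # fitted coefficients, one per predictor column
print(clf.score(x, y))  # R^2 on the training data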
it runs fine and prints:
[-55.06808051 -0.41350797 0.87904675 8.45228978 -3.54825048
-8.63347841 22.26301155 20.82488927 -8.96890439 -8.67371986
-6.99648938 0.07511078 -11.35819826 -7.51583817 -9.34964411
10.41088005 12.30815737 8.55961149 14.69859856 9.81675829
13.2959571 10.92224962 3.62143586 12.07015579 5.35530774]
0.455478830291
But if I include one more column, like this:
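from sklearn import linear_model

# Predictors: columns 3 through 28 (26 columns, one more than before); target: column 0.
# .values replaces as_matrix(), which newer pandas versions no longer provide.
x = df3.iloc[:, 3:29].values.astype(int)
y = df3.iloc[:, 0].values.astype(int)

clf = linear_model.LinearRegression()
clf.fit(x, y)
print(clf.coef_)        # fitted coefficients, one per predictor column
print(clf.score(x, y))  # R^2 on the training data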
the results are unrealistic:
[ -5.50139971e+01 8.30093647e+12 8.30093647e+12 8.30093622e+12
8.30093635e+12 8.30093635e+12 8.30093635e+12 8.30093635e+12
-1.09960450e+14 -1.09960448e+14 -1.09960450e+14 -1.09960448e+14
-1.09960450e+14 -1.09960448e+14 -1.09960450e+14 -1.09960450e+14
-1.09960444e+14 -1.09960450e+14 -1.09960448e+14 -1.09960450e+14
1.50019494e+01 1.24627093e+01 5.25730696e+00 1.39210971e+01
7.13531424e+00 8.90822183e+00]
-5841202768.85
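Why does this happen? Is there a limit on the number of independent variables?
Note: only x has changed (the column range 3:28 became 3:29).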
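A check that might help narrow this down: as far as I know, LinearRegression has no hard limit on the number of predictors, but very large coefficients that nearly cancel each other are a common symptom of linearly dependent columns (for example, a full set of dummy variables alongside the intercept). The sketch below is only a guess at the cause, since the data is not shown. It compares the rank of the design matrix with its number of columns. The helper name `collinearity_report` and the synthetic data are hypothetical and are not taken from df3.

```python
import numpy as np


def collinearity_report(x):
    """Return (n_columns, rank, condition_number) of x with an intercept column added."""
    x = np.asarray(x, dtype=float)
    # LinearRegression fits an intercept by default, so include a column of ones.
    design = np.column_stack([np.ones(len(x)), x])
    rank = np.linalg.matrix_rank(design)
    cond = np.linalg.cond(design)
    return design.shape[1], rank, cond


# Hypothetical data: a 3-level category encoded as one-hot (dummy) columns.
rng = np.random.RandomState(0)
group = rng.randint(0, 3, size=200)
dummies = np.eye(3)[group]            # each row sums to 1
other = rng.normal(size=(200, 2))     # two unrelated numeric predictors

ok = np.column_stack([dummies[:, :2], other])  # one dummy column dropped
bad = np.column_stack([dummies, other])        # all dummies kept: they sum to the intercept

print(collinearity_report(ok))   # rank equals the number of columns
print(collinearity_report(bad))  # rank is one less than the number of columns
```

On the real data the same check would be `collinearity_report(df3.iloc[:, 3:29].values)`. If the rank comes out smaller than the number of columns, the coefficients are not uniquely determined, and dropping one column from each redundant group (or using a regularized model such as Ridge) would be the usual things to try.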