Linear regression returns results that differ from the synthesized parameters

Asked: 2016-03-21 12:28:05

Tags: python scikit-learn linear-regression

Trying this code:

from sklearn import linear_model
import numpy as np

x1 = np.arange(0, 10, 0.1)
x2 = x1 * 10

y = 2*x1 + 3*x2
X = np.vstack((x1, x2)).transpose()

reg_model = linear_model.LinearRegression()
reg_model.fit(X, y)

print(reg_model.coef_)
# should be [2, 3]

print(reg_model.predict([[5, 6]]))
# should be 2*5 + 3*6 = 28

print(reg_model.intercept_)
# perfectly at the expected value of 0

print(reg_model.score(X, y))
# seems to be rather confident to be right

Results

  • coef_: [0.31683168 3.16831683]
  • predict: 20.5940594059
  • intercept_: 0.0
  • score: 1.0

So, not what I expected: the coefficients differ from the parameters used to synthesize the data. Why is that?

1 answer:

Answer 0 (score: 0)

Your problem is the uniqueness of the solution. The two features are perfectly collinear (x2 = 10*x1, and a linear transformation of one feature produces no new information in this model's eyes), so there are infinitely many coefficient pairs that fit your data: since y = 2*x1 + 3*x2 = 2*x1 + 30*x1 = 32*x1, any pair (a, b) with a + 10*b = 32 reproduces y exactly, and the [0.3168..., 3.1683...] you got is one such pair. Apply a nonlinear transformation to the second feature and you will see the desired output.
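As a quick sanity check (a minimal sketch using only numpy, not part of the original answer), you can verify that the design matrix is rank-deficient and that several different coefficient pairs fit the data equally well:

import numpy as np

x1 = np.arange(0, 10, 0.1)
x2 = x1 * 10                        # perfectly collinear with x1
X = np.vstack((x1, x2)).transpose()
y = 2*x1 + 3*x2

# The two columns are linearly dependent, so the matrix has rank 1, not 2.
print(np.linalg.matrix_rank(X))     # 1

# Any pair (a, b) with a + 10*b == 32 reproduces y exactly,
# because y = 2*x1 + 3*(10*x1) = 32*x1.
for a, b in [(2.0, 3.0), (32.0, 0.0), (0.0, 3.2)]:
    print(np.allclose(X @ np.array([a, b]), y))   # True, True, True

With a nonlinear second feature, the problem becomes well-posed: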

from sklearn import linear_model
import numpy as np

x1 = np.arange(0, 10, 0.1)
x2 = x1**2                          # nonlinear transform: no longer collinear
X = np.vstack((x1, x2)).transpose()
y = 2*x1 + 3*x2

reg_model = linear_model.LinearRegression()
reg_model.fit(X, y)
print(reg_model.coef_)
# should be [2, 3]

print(reg_model.predict([[5, 6]]))
# should be 2*5 + 3*6 = 28

print(reg_model.intercept_)
# perfectly at the expected value of 0

print(reg_model.score(X, y))

Output

  • coef_: [ 2. 3.]
  • predict: [ 28.]
  • intercept_: -2.84217094304e-14
  • score: 1.0
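As an aside on why the original run returned that particular pair: with rank-deficient input, a least-squares solver typically returns the minimum-norm solution among the infinitely many that fit (this is an implementation detail of the underlying solver, so treat it as an assumption). On the constraint line a + 10*b = 32, that point is (a, b) = (32/101, 320/101), which matches the question's coef_ exactly:

import numpy as np

# The minimum-norm point on the line a + 10*b = 32 lies along the
# normal direction (1, 10), scaled so that a + 10*b equals 32.
a, b = 32/101, 320/101
print(a, b)   # 0.31683168..., 3.16831683... -- the coef_ from the question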