Linear regression returns results that differ from the synthesized parameters

Asked: 2016-03-21 12:28:05

Tags: python scikit-learn linear-regression

Trying this code:

from sklearn import linear_model
import numpy as np

x1 = np.arange(0, 10, 0.1)
x2 = x1 * 10

y = 2*x1 + 3*x2
X = np.vstack((x1, x2)).transpose()

reg_model = linear_model.LinearRegression()
reg_model.fit(X, y)

print(reg_model.coef_)
# should be [2, 3]

print(reg_model.predict([[5, 6]]))
# should be 2*5 + 3*6 = 28

print(reg_model.intercept_)
# perfectly at the expected value of 0

print(reg_model.score(X, y))
# seems to be rather confident to be right

Results

  • coef_: [0.31683168 3.16831683]
  • predict: 20.5940594059
  • intercept_: 0.0
  • score: 1.0

So, not what I expected: the coefficients differ from the parameters used to synthesize the data. Why is that?

1 answer:

Answer 0 (score: 0)

Your problem is the uniqueness of the solution. The two features are perfectly collinear (x2 = 10*x1, and a linear transformation of one feature produces no new information in this model's eyes), so there are infinitely many coefficient pairs that fit your data: since y = 2*x1 + 3*x2 = 2*x1 + 30*x1 = 32*x1, any pair (a, b) with a + 10*b = 32 reproduces y exactly, and the [0.3168..., 3.1683...] you got is one such pair. Apply a nonlinear transformation to the second feature and you will see the desired output.
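As a quick sanity check (a minimal sketch using only numpy, not part of the original answer), you can verify that the design matrix is rank-deficient and that several different coefficient pairs fit the data equally well:

import numpy as np

x1 = np.arange(0, 10, 0.1)
x2 = x1 * 10                        # perfectly collinear with x1
X = np.vstack((x1, x2)).transpose()
y = 2*x1 + 3*x2

# The two columns are linearly dependent, so the matrix has rank 1, not 2.
print(np.linalg.matrix_rank(X))     # 1

# Any pair (a, b) with a + 10*b == 32 reproduces y exactly,
# because y = 2*x1 + 3*(10*x1) = 32*x1.
for a, b in [(2.0, 3.0), (32.0, 0.0), (0.0, 3.2)]:
    print(np.allclose(X @ np.array([a, b]), y))   # True, True, True

With a nonlinear second feature, the problem becomes well-posed: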

from sklearn import linear_model
import numpy as np

x1 = np.arange(0, 10, 0.1)
x2 = x1**2                          # nonlinear transform: no longer collinear
X = np.vstack((x1, x2)).transpose()
y = 2*x1 + 3*x2

reg_model = linear_model.LinearRegression()
reg_model.fit(X, y)
print(reg_model.coef_)
# should be [2, 3]

print(reg_model.predict([[5, 6]]))
# should be 2*5 + 3*6 = 28

print(reg_model.intercept_)
# perfectly at the expected value of 0

print(reg_model.score(X, y))

Output

  • coef_: [ 2. 3.]
  • predict: [ 28.]
  • intercept_: -2.84217094304e-14
  • score: 1.0
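As an aside on why the original run returned that particular pair: with rank-deficient input, a least-squares solver typically returns the minimum-norm solution among the infinitely many that fit (this is an implementation detail of the underlying solver, so treat it as an assumption). On the constraint line a + 10*b = 32, that point is (a, b) = (32/101, 320/101), which matches the question's coef_ exactly:

import numpy as np

# The minimum-norm point on the line a + 10*b = 32 lies along the
# normal direction (1, 10), scaled so that a + 10*b equals 32.
a, b = 32/101, 320/101
print(a, b)   # 0.31683168..., 3.16831683... -- the coef_ from the question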