Question

I want to use PolynomialFeatures together with LinearRegression in sklearn to predict the y values for 100 data points, namely np.linspace(0, 10, 100).

Data:

import numpy as np

n = 15
x = np.linspace(0,10,n) + np.random.randn(n)/5
y = np.sin(x)+x/6 + np.random.randn(n)/10

What I have so far works fine with plain Linear Regression, but when I try new models built on Polynomial Features, it fails.

This works:

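from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

pre_result = []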
X_train, X_test, y_train, y_test = train_test_split(x, y, random_state=0)
linreg = LinearRegression().fit(X_train.reshape(-1, 1), y_train)
pre_result.append(linreg.predict(np.linspace(0, 10, 100).reshape(-1, 1)))

This raises an error:

from sklearn.preprocessing import PolynomialFeatures

third = PolynomialFeatures(degree=3)
X_third = third.fit_transform(x.reshape(-1, 1))
X_train, X_test, y_train, y_test = train_test_split(X_third, y, random_state=0)
polyreg = LinearRegression().fit(X_train, y_train)
pre_result.append(polyreg.predict(np.linspace(0, 10, 100).reshape(-1, 1)))

Error:

ValueError: shapes (100,1) and (4,) not aligned: 1 (dim 1) != 4 (dim 0)

If I use PolynomialFeatures(degree=6) instead of degree=3, I get ValueError: shapes (100,1) and (7,) not aligned: 1 (dim 1) != 7 (dim 0). This completely confuses me.

Nevertheless, the following example runs without problems:

from sklearn.datasets import make_friedman1

X_F1, y_F1 = make_friedman1(n_samples=100,
                            n_features=7, random_state=0)
poly = PolynomialFeatures(degree=2)
X_F1_poly = poly.fit_transform(X_F1)

X_train, X_test, y_train, y_test = train_test_split(X_F1_poly, y_F1,
                                               random_state = 0)
linreg = LinearRegression().fit(X_train, y_train)
predict = linreg.predict(X_test)

I would appreciate any insight on this. Thanks in advance.

Answer 1

The problem is in the last line of your snippet. Here

polyreg.predict(np.linspace(0, 10, 100).reshape(-1, 1))

you pass "raw" values of the x variable, an array of shape (100, 1). That means you have 100 observations, each with a single variable.

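Your model (polyreg) expects 4 variables per observation: the degree-3 polynomial features 1, x, x^2 and x^3 that it was trained on. With degree=6 it expects 7, which is why the number in the error message changes.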
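To see where the 4 (and the 7 for degree 6) comes from, here is a minimal sketch that expands the same 100 points and records the resulting shapes. The names grid, expanded and shapes are mine, not from the question.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

# 100 evenly spaced points as a single-column matrix, as in the question
grid = np.linspace(0, 10, 100).reshape(-1, 1)

shapes = {}
for degree in (3, 6):
    # With one input column, degree d expands to d + 1 columns: 1, x, ..., x^d
    expanded = PolynomialFeatures(degree=degree).fit_transform(grid)
    shapes[degree] = expanded.shape
    print(degree, grid.shape, "->", expanded.shape)
```

The raw grid has one column, while the expanded matrices have 4 and 7 columns, which matches the two numbers in each error message.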

Solution 1

You can apply the same fitted transformer to the new inputs before predicting:

polyreg.predict( third.transform( np.linspace(0, 10, 100).reshape(-1, 1) ))
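For completeness, here is how that one-liner might fit into the code from the question, as a self-contained sketch. The fixed seed and the names grid, grid_poly and prediction are my additions, not part of the original code.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import PolynomialFeatures

np.random.seed(0)  # assumption: fixed seed so the sketch is repeatable
n = 15
x = np.linspace(0, 10, n) + np.random.randn(n) / 5
y = np.sin(x) + x / 6 + np.random.randn(n) / 10

third = PolynomialFeatures(degree=3)
X_third = third.fit_transform(x.reshape(-1, 1))  # shape (15, 4)
X_train, X_test, y_train, y_test = train_test_split(X_third, y, random_state=0)
polyreg = LinearRegression().fit(X_train, y_train)

# Expand the new inputs with the SAME fitted transformer before predicting
grid = np.linspace(0, 10, 100).reshape(-1, 1)  # shape (100, 1)
grid_poly = third.transform(grid)              # shape (100, 4)
prediction = polyreg.predict(grid_poly)        # shape (100,)
```

The point is that every array passed to polyreg.predict must have the same number of columns as the array passed to fit.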

Possibly a better solution:

The approach used in the documentation's example, which chains both steps in a Pipeline, is probably better:

from sklearn.pipeline import Pipeline

poly3_model = Pipeline(
    [('poly', PolynomialFeatures(degree=3)),
     ('linear', LinearRegression())])

X_train, X_test, y_train, y_test = train_test_split(x.reshape(-1, 1), y.reshape(-1, 1), random_state=0)

poly3_model.fit( X_train, y_train )
test_prediction = poly3_model.predict( X_test )
another_prediction = poly3_model.predict( np.linspace(0, 10, 100).reshape(-1, 1))
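As a sanity check, the sketch below fits the manual route and the Pipeline on the same data and compares their predictions. It skips train_test_split to keep things short, and the seed and variable names are my own choices.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures

np.random.seed(0)  # assumption: fixed seed so the sketch is repeatable
n = 15
x = (np.linspace(0, 10, n) + np.random.randn(n) / 5).reshape(-1, 1)
y = np.sin(x).ravel() + x.ravel() / 6 + np.random.randn(n) / 10
grid = np.linspace(0, 10, 100).reshape(-1, 1)

# Manual route: expand the features yourself, for fitting and for predicting
poly = PolynomialFeatures(degree=3)
manual = LinearRegression().fit(poly.fit_transform(x), y)
manual_pred = manual.predict(poly.transform(grid))

# Pipeline route: the expansion is applied automatically in fit and predict
pipe = Pipeline([('poly', PolynomialFeatures(degree=3)),
                 ('linear', LinearRegression())])
pipe_pred = pipe.fit(x, y).predict(grid)

same = np.allclose(manual_pred, pipe_pred)
print(same)
```

Both routes do the same computation. The Pipeline just makes it impossible to forget the transform step. Note that y is one-dimensional here, so the predictions have shape (100,). With y.reshape(-1, 1), as in the snippet above, they come back with shape (100, 1).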

Sklearn: Linear Regression with PolynomialFeatures produces "shapes not aligned"
