Question

I am trying to increase the dimensionality of my initial array:

import matplotlib.pyplot as plt
import numpy as np
from sklearn.preprocessing import PolynomialFeatures
rng = np.random.RandomState(0)  # assumed seed: the original snippet never defines rng
x = 10*rng.rand(50)
y = np.sin(x) + 0.1*rng.rand(50)

poly = PolynomialFeatures(7, include_bias=False)
poly.fit_transform(x[:,np.newaxis])

First, I understand that np.newaxis is creating an additional column. Why is this necessary?

Now I train a linear regression on the updated x data (poly):

test_x = np.linspace(0,10,1000)
from sklearn.linear_model import LinearRegression

model = LinearRegression()
# train with increased dimension(x=poly) with its target
model.fit(poly,y)
# testing
test_y = model.predict(test_x)

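When I run it, I get "ValueError: Expected 2D array, got scalar array instead" on the line model.fit(poly, y). I have already added a dimension for poly, so what is going wrong?

Also, what is the difference between x[:, np.newaxis] and x[:, None]?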

Answer 1

In [55]: x=10*np.random.rand(5)                                                 
In [56]: x                                                                      
Out[56]: array([6.47634068, 6.25520837, 7.58822106, 4.65466951, 2.35783624])
In [57]: x.shape                                                                
Out[57]: (5,)

newaxis does not add a column; it adds a dimension:

In [58]: x1 = x[:,np.newaxis]                                                   
In [59]: x1                                                                     
Out[59]: 
array([[6.47634068],
       [6.25520837],
       [7.58822106],
       [4.65466951],
       [2.35783624]])
In [60]: x1.shape                                                               
Out[60]: (5, 1)

The value of np.newaxis is None, so the two work exactly the same way.

In [61]: x[:,None].shape
Out[61]: (5, 1)

One is a little clearer for human readers, the other is a little easier to type. https://www.numpy.org/devdocs/reference/constants.html
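
As a quick check of that equivalence, here is a minimal sketch. The array x is a made-up example, and reshape(-1, 1) is included as a third spelling that the answer itself does not mention:

```python
import numpy as np

x = np.arange(5.0)            # made-up 1-D example, shape (5,)

# np.newaxis is simply another name for None
print(np.newaxis is None)

a = x[:, np.newaxis]          # explicit spelling
b = x[:, None]                # shorter spelling, same indexing operation
c = x.reshape(-1, 1)          # alternative: reshape into one column

print(a.shape, b.shape, c.shape)
print(np.array_equal(a, b) and np.array_equal(a, c))
```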

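Whether x or x1 works depends on what the learning code expects. Some learning code expects input of the form (samples, features). A (50,)-shaped array could be interpreted as 50 samples with 1 feature, or as 1 case with 50 features, but it is better to state exactly what you mean.

Take a look at the docs: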
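
The two readings of a (50,) array can be made explicit by choosing where the new axis goes. This is only an illustrative sketch; the names as_samples and as_features are made up for the example:

```python
import numpy as np

x = np.arange(50.0)               # ambiguous 1-D array, shape (50,)

# 50 samples, 1 feature each: new axis goes last
as_samples = x[:, np.newaxis]

# 1 sample with 50 features: new axis goes first
as_features = x[np.newaxis, :]

print(as_samples.shape)
print(as_features.shape)
```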

https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.PolynomialFeatures.html#sklearn.preprocessing.PolynomialFeatures.fit_transform

poly.fit_transform
X : numpy array of shape [n_samples, n_features]

It certainly looks like fit_transform expects 2-D input.
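
A small sketch of what that means in practice, assuming a seeded RandomState in place of the undefined rng from the question. Passing the 1-D array raises a ValueError, while the 2-D version returns one column per power of x:

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.RandomState(0)    # assumed seed, not from the question
x = 10 * rng.rand(50)             # shape (50,)

poly = PolynomialFeatures(7, include_bias=False)

# 1-D input is rejected by scikit-learn's input validation
try:
    poly.fit_transform(x)
    rejected = False
except ValueError:
    rejected = True
print("1-D input rejected:", rejected)

# 2-D input of shape (n_samples, n_features) is accepted
X_poly = poly.fit_transform(x[:, np.newaxis])
print(X_poly.shape)               # columns are x, x**2, ..., x**7
```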

https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LinearRegression.html#sklearn.linear_model.LinearRegression.fit

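X should be 2-D, with shape (n_samples, n_features). y can be 1-D with shape (n_samples,) or 2-D with shape (n_samples, n_targets).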
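
The error in the question appears to come from passing poly itself to model.fit: poly is the PolynomialFeatures transformer object, not an array, and the array returned by fit_transform was never stored. Below is a hedged sketch of one way to wire the pieces together. The seed and the names X_poly and test_X_poly are assumptions introduced for the example, and a 1-D y is passed as-is:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.RandomState(0)    # assumed seed, not from the question
x = 10 * rng.rand(50)
y = np.sin(x) + 0.1 * rng.rand(50)

poly = PolynomialFeatures(7, include_bias=False)
# Keep the transformed array; poly itself is only the transformer
X_poly = poly.fit_transform(x[:, np.newaxis])        # shape (50, 7)

model = LinearRegression()
model.fit(X_poly, y)              # train on the expanded features

# The test points need the same 2-D shape and the same expansion
test_x = np.linspace(0, 10, 1000)
test_X_poly = poly.transform(test_x[:, np.newaxis])  # shape (1000, 7)
test_y = model.predict(test_X_poly)

print(test_y.shape)
```

Using poly.transform (rather than fit_transform) on the test points reuses the expansion already fitted on the training data.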
