Question

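I am following the scikit-learn documentation example at http://scikit-learn.org/stable/auto_examples/svm/plot_iris.html#sphx-glr-auto-examples-svm-plot-iris-py. I am trying to plot the decision boundary of a classifier, but the line "Z = clf.predict(np.c_[xx.ravel(), yy.ravel()])" raises the error "ValueError: X has 2 features per sample; expecting 908430".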

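import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import SGDClassifier

clf = SGDClassifier().fit(step2, index)
X = step2
y = index
h = .02  # step size of the mesh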
colors = "bry"
x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
xx, yy = np.meshgrid(np.arange(x_min, x_max, h),
                 np.arange(y_min, y_max, h))

Z = clf.predict(np.c_[xx.ravel(), yy.ravel()])

# Put the result into a color plot
Z = Z.reshape(xx.shape)
plt.contourf(xx, yy, Z, cmap=plt.cm.Paired)
plt.axis('off')

# Plot also the training points
plt.scatter(X[:, 0], X[:, 1], c=y, cmap=plt.cm.Paired)

'index' is the [98579 x 1] array of labels for the reviews: positive, neutral, and negative.

array(['N', 'N', 'P', ..., 'NEU', 'P', 'N'], dtype=object)

'step2' is the [98579 x 908430] sparse matrix produced by CountVectorizer from the review text.

<98579x908430 sparse matrix of type '<type 'numpy.float64'>'
with 3168845 stored elements in Compressed Sparse Row format>

Answer 1

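You cannot plot a classifier's decision boundary for data that is not 2-dimensional. Your data is clearly high-dimensional: it has 908430 dimensions (an NLP task, I assume). There is no way to plot the actual decision boundary of such a model. The example you are following is trained on 2D data (a reduced version of the iris dataset), and that is the only reason they were able to plot it.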
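
If an approximate picture is still useful, one possible workaround is to project the features down to 2 dimensions and train a separate classifier on that projection. The sketch below rests on several assumptions: the tiny corpus and labels are made up for illustration, TruncatedSVD is just one projection that accepts sparse input, and the boundary drawn belongs to the new 2D model (clf_2d), not to the original 908430-feature model.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import SGDClassifier

# Tiny made-up corpus standing in for the real reviews (an assumption).
docs = [
    "great product works well", "really great and reliable",
    "love it great value", "terrible product broke fast",
    "awful quality really terrible", "hate it broke twice",
]
labels = np.array(["P", "P", "P", "N", "N", "N"])

# Sparse count matrix, one column per vocabulary word (more than 2 features).
X_sparse = CountVectorizer().fit_transform(docs)

# Project the sparse counts down to 2 components.
X_2d = TruncatedSVD(n_components=2, random_state=0).fit_transform(X_sparse)

# Trained on the 2D projection, so this is NOT the original model.
clf_2d = SGDClassifier(random_state=0).fit(X_2d, labels)

h = 0.02  # step size of the mesh
x_min, x_max = X_2d[:, 0].min() - 1, X_2d[:, 0].max() + 1
y_min, y_max = X_2d[:, 1].min() - 1, X_2d[:, 1].max() + 1
xx, yy = np.meshgrid(np.arange(x_min, x_max, h),
                     np.arange(y_min, y_max, h))

# The grid has 2 features per sample, matching what clf_2d was fitted on.
grid = np.c_[xx.ravel(), yy.ravel()]
Z = clf_2d.predict(grid)

# contourf needs numbers, so map the string labels to integer codes.
classes = clf_2d.classes_
Z_codes = np.searchsorted(classes, Z).reshape(xx.shape)
y_codes = np.searchsorted(classes, labels)

plt.contourf(xx, yy, Z_codes, cmap=plt.cm.Paired, alpha=0.8)
plt.scatter(X_2d[:, 0], X_2d[:, 1], c=y_codes, cmap=plt.cm.Paired,
            edgecolors="k")
plt.axis("off")
# plt.show()  # uncomment to display the figure
```

The key point is that the grid passed to predict must have the same number of features as the data the classifier was fitted on. Here both have 2, which is why the call that failed in the question succeeds.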
