Plot decision boundaries, results and training points

Date: 2020-11-05 15:02:00

Tags: python machine-learning plot scikit-learn classification

I am working on a classification problem with the wine dataset that ships with scikit-learn. I would like to reproduce the code from the scikit-learn guide for the iris dataset and apply it to my wine dataset, and I have more or less finished that part.
However, I now have to plot the results: the decision boundaries together with the training and test points.
This link shows the whole procedure:
Click
For example, the first model I built is a K-nearest neighbors classifier (KNeighborsClassifier). I would now like to do something similar to the scikit-learn guide at this link: Click
So, here is my code for building the classifier:

from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import classification_report

n5 = KNeighborsClassifier()  # trial with the default number of neighbors (n_neighbors=5)
n5.fit(X_train, Y_train)
# Now that the model is instantiated and fitted to the training data, make some predictions.
pred_n5 = n5.predict(X_test)
print(n5.score(X_test, Y_test))
print(n5.predict(X_test))
print(classification_report(Y_test, pred_n5))
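The snippet above does not show how X_train, Y_train, X_test and Y_test were built; a minimal sketch, assuming load_wine and train_test_split over all 13 wine features, might be:

from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split

# Hypothetical preprocessing (not shown in the question): split the full 13-feature wine data.
wine = load_wine()
X_train, X_test, Y_train, Y_test = train_test_split(
    wine.data, wine.target, test_size=0.3, random_state=0)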
Below is the scikit-learn iris example that I am trying to adapt:

import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap
from sklearn import neighbors, datasets

n_neighbors = 15

# import some data to play with
iris = datasets.load_iris()

# we only take the first two features. We could avoid this ugly
# slicing by using a two-dim dataset
X = iris.data[:, :2]
y = iris.target

h = .02  # step size in the mesh

# Create color maps
cmap_light = ListedColormap(['orange', 'cyan', 'cornflowerblue'])
cmap_bold = ListedColormap(['darkorange', 'c', 'darkblue'])

for weights in ['uniform', 'distance']:
    # we create an instance of Neighbours Classifier and fit the data.
    clf = neighbors.KNeighborsClassifier(n_neighbors, weights=weights)
    clf.fit(X, y)

    # Plot the decision boundary. For that, we will assign a color to each
    # point in the mesh [x_min, x_max]x[y_min, y_max].
    x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
    y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
    xx, yy = np.meshgrid(np.arange(x_min, x_max, h),
                         np.arange(y_min, y_max, h))
    Z = clf.predict(np.c_[xx.ravel(), yy.ravel()])

    # Put the result into a color plot
    Z = Z.reshape(xx.shape)
    plt.figure()
    plt.pcolormesh(xx, yy, Z, cmap=cmap_light)

    # Plot also the training points
    plt.scatter(X[:, 0], X[:, 1], c=y, cmap=cmap_bold,
                edgecolor='k', s=20)
    plt.xlim(xx.min(), xx.max())
    plt.ylim(yy.min(), yy.max())
    plt.title("3-Class classification (k = %i, weights = '%s')"
              % (n_neighbors, weights))

plt.show()

But if I try to do exactly the same thing, changing only the variable names, I get errors such as:

ValueError: query data dimension must match training data dimension

I have to do this for every classifier I use.

UPDATE: I also do not know how to adapt the scikit-learn iris example to other classifiers, such as this decision tree example: Click
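As far as the plotting goes, the meshgrid code above only relies on the classifier having a predict method, so (as a sketch that reuses X, y, xx and yy from the iris example above, with a hypothetical DecisionTreeClassifier) only the line that builds clf should need to change:

from sklearn.tree import DecisionTreeClassifier

# Hypothetical swap: any estimator with a predict() method can take the KNN
# classifier's place inside the plotting loop above; the mesh prediction is unchanged.
clf = DecisionTreeClassifier(max_depth=4)
clf.fit(X, y)                                   # X, y: the 2-feature iris data from above
Z = clf.predict(np.c_[xx.ravel(), yy.ravel()])  # same mesh query as in the loop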

EDIT: I think the problem lies in the dimensions of X_train, Y_train, X_test and Y_test. I think I can only select the target variable this way, but I am not sure.
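That guess is consistent with the error message: if the classifier was fitted on all 13 wine features, the mesh query above has only 2 columns, so the dimensions cannot match. A minimal sketch of the adaptation, assuming load_wine and keeping only the first two features for both fitting and plotting (just as the iris example uses iris.data[:, :2]), might be:

import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap
from sklearn.datasets import load_wine
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical adaptation: keep only the first two wine features so that the
# 2-D mesh query has the same number of columns as the training data.
wine = load_wine()
X2 = wine.data[:, :2]
y = wine.target

clf = KNeighborsClassifier(n_neighbors=5)
clf.fit(X2, y)  # fitted on 2 features, so predict() accepts 2-column queries

h = .02  # step size in the mesh
x_min, x_max = X2[:, 0].min() - 1, X2[:, 0].max() + 1
y_min, y_max = X2[:, 1].min() - 1, X2[:, 1].max() + 1
xx, yy = np.meshgrid(np.arange(x_min, x_max, h), np.arange(y_min, y_max, h))
Z = clf.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)

plt.pcolormesh(xx, yy, Z, cmap=ListedColormap(['orange', 'cyan', 'cornflowerblue']))
plt.scatter(X2[:, 0], X2[:, 1], c=y, cmap=ListedColormap(['darkorange', 'c', 'darkblue']),
            edgecolor='k', s=20)
plt.xlim(xx.min(), xx.max())
plt.ylim(yy.min(), yy.max())
plt.show()

If the data is split into train and test sets, the same X_test[:, :2] slice could be drawn with a second scatter call to show the test points.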

0 Answers:

There are no answers yet.