Question

I wrote a function to find the confusion matrix of my model:

NN_model = KNeighborsClassifier(n_neighbors=1)
NN_model.fit(mini_train_data, mini_train_labels)
# Create the confusion matrix for the dev data
confusion = confusion_matrix(dev_labels, NN_model.predict(dev_data))
print(confusion)

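But I am not able to display images of the 5 digits that are most often confused with others. I tried the code below, but I did not get the expected result.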

index = 0
misclassifiedIndexes = []
for label, predict in zip(dev_labels, predictions):
     if label != predict: 
        misclassifiedIndexes.append(index)
        index +=1

plt.figure(figsize=(20,4))
for plotIndex, badIndex in enumerate(misclassifiedIndexes[0:5]):
    plt.subplot(1, 5, plotIndex + 1)
    plt.imshow(np.reshape(dev_data[badIndex], (28,28)), cmap=plt.cm.gray)
    plt.title('Predict: {}, Actual: {}'.format(predictions[badIndex], dev_labels[badIndex]), fontsize = 15)

Could you please take a look at what is wrong with my code? Thanks!

Answer 1

I couldn't reproduce your code, so I have provided a reproducible example here.

You can use np.where on the boolean comparison between the predictions and the actual values.
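As a small illustration of that idea, here is a sketch on toy arrays. The values in `actual` and `predicted` are made up for the demonstration and are not taken from the question's data.

```python
import numpy as np

# Made-up labels purely for illustration (not from the question's data)
actual = np.array([3, 5, 8, 5, 1])
predicted = np.array([3, 6, 8, 5, 7])

# The comparison yields a boolean mask; np.where returns a tuple of index
# arrays, so [0] picks out the indices along the first (only) axis
mismatch_mask = actual != predicted
misclassified = np.where(mismatch_mask)[0]

print(misclassified)  # [1 4]
```

The resulting indices can be used directly to look up the matching rows in the data array.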

Try this example:

from sklearn.neighbors import KNeighborsClassifier
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.4)

NN_model = KNeighborsClassifier(n_neighbors=1)
NN_model.fit(X_train, y_train)
# Create the confusion matrix for the dev data
from sklearn.metrics import confusion_matrix
predictions = NN_model.predict(X_test)
confusion = confusion_matrix(y_test, predictions)

import matplotlib.pyplot as plt
import numpy as np

# Indices of the test samples whose prediction differs from the true label
misclassifiedIndexes = np.where(y_test != predictions)[0]


fig, ax = plt.subplots(4, 3, figsize=(15, 8))
ax = ax.ravel()
# Hide the frame and ticks on every panel so unused ones stay blank
for a in ax:
    a.axis('off')
# zip stops at the shorter sequence, so at most 12 images are drawn
# even if there are more misclassified samples than panels
for a, badIndex in zip(ax, misclassifiedIndexes):
    a.imshow(np.reshape(X_test[badIndex], (8, 8)), cmap=plt.cm.gray)
    a.set_title(f'Predict: {predictions[badIndex]}, '
                f'Actual: {y_test[badIndex]}', fontsize=10)
plt.show()
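As for the loop in the question: `index += 1` sits inside the `if` branch, so the counter only advances on a mismatch and the list ends up holding 0, 1, 2, ... rather than the real positions of the errors. If you prefer to keep an explicit loop, a minimal sketch of a fix is to let `enumerate` track the position. The helper name `misclassified_indexes` and the toy data below are hypothetical, chosen only for illustration; the snippet in the question also never defines `predictions`, so this assumes it holds the output of `NN_model.predict(dev_data)`.

```python
def misclassified_indexes(labels, predictions):
    """Return the positions where the label and the prediction differ."""
    bad = []
    # enumerate advances the position on every pair, not only on mismatches
    for index, (label, predict) in enumerate(zip(labels, predictions)):
        if label != predict:
            bad.append(index)
    return bad


# Toy data, invented for illustration
labels = [3, 5, 8, 5, 1]
predictions = [3, 6, 8, 5, 7]
print(misclassified_indexes(labels, predictions))  # [1, 4]
```

With that change, the plotting code from the question should pick up the samples that were actually misclassified.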
