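I am trying to use the split function to split my dataset, iterate over the resulting folds, compute a confusion matrix on each iteration, and then take the mean of all the confusion matrices. I wrote a function to do this. It works in my other project, where the predicted classes are (0, 1), but it does not work here, where the predicted classes are (1, 2).
Here is the head() of the data I am using: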
   Age  Gender    TB   DB  Alkphos  Sgpt  Sgot   TP  ALB   A/G  Class
0   65  Female   0.7  0.1      187    16    18  6.8  3.3  0.90      1
1   62    Male  10.9  5.5      699    64   100  7.5  3.2  0.74      1
2   62    Male   7.3  4.1      490    60    68  7.0  3.3  0.89      1
3   58    Male   1.0  0.4      182    14    20  6.8  3.4  1.00      1
4   72    Male   3.9  2.0      195    27    59  7.3  2.4  0.40      1
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import RepeatedStratifiedKFold

def total_confusion_matrix(model, x, y):
    # One confusion matrix per fold (5 splits x 10 repeats = 50 matrices).
    total_matrix = []
    cross = RepeatedStratifiedKFold(n_splits=5, n_repeats=10, random_state=1)
    for train_i, test_i in cross.split(x, y):
        # train_i and test_i are arrays of integer row positions.
        x_train, x_test = x[train_i], x[test_i]
        y_train, y_test = y[train_i], y[test_i]
        model.fit(x_train, y_train)
        total_matrix.append(confusion_matrix(y_test, model.predict(x_test)))
    print('The total of confusion matrix in cross validation:')
    sns.heatmap(sum(total_matrix), annot=True)
    plt.title('Total Confusion Matrix')
    plt.show()
    print('\n The Mean confusion matrix in cross Validation:')
    sns.heatmap(sum(total_matrix) / len(total_matrix), annot=True)
    plt.title('Mean Confusion Matrix')
    plt.show()

total_confusion_matrix(rf_best, x, y)
Calling the function raises this error:
KeyError: "None of [Int64Index([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 11,\n ...\n 569, 570, 571, 572, 573, 574, 576, 577, 578, 580],\n dtype='int64', length=464)] are in the [columns]"
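The "are in the [columns]" part of the message suggests that x is a pandas DataFrame here. On a DataFrame, x[train_i] looks the integers up as column labels, not as row positions, so the class labels (1, 2) are probably not what triggers the error. Below is a minimal sketch of the same loop with positional indexing through .iloc. It rests on several assumptions that are not in the original post: the function name cv_confusion_matrices, the labels parameter, the synthetic demo_x and demo_y data, and the small RandomForestClassifier standing in for rf_best are all hypothetical. Plotting is left out so the block stays self-contained.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import RepeatedStratifiedKFold


def cv_confusion_matrices(model, x, y, n_splits=5, n_repeats=10, labels=None):
    """Return (summed, mean) confusion matrices over repeated stratified folds."""
    # Wrap the inputs so that .iloc is available for arrays and DataFrames alike.
    x = pd.DataFrame(x)
    y = pd.Series(y)
    if labels is None:
        # Fix the label order so every fold yields a matrix of the same shape.
        labels = np.sort(y.unique())
    cross = RepeatedStratifiedKFold(
        n_splits=n_splits, n_repeats=n_repeats, random_state=1
    )
    matrices = []
    for train_i, test_i in cross.split(x, y):
        # split() yields integer row positions, so index by position with .iloc.
        x_train, x_test = x.iloc[train_i], x.iloc[test_i]
        y_train, y_test = y.iloc[train_i], y.iloc[test_i]
        model.fit(x_train, y_train)
        matrices.append(
            confusion_matrix(y_test, model.predict(x_test), labels=labels)
        )
    total = np.sum(matrices, axis=0)
    return total, total / len(matrices)


# Synthetic stand-in data with classes 1 and 2 (not the liver dataset above).
rng = np.random.default_rng(0)
demo_x = pd.DataFrame(rng.normal(size=(100, 4)), columns=list("abcd"))
demo_y = pd.Series(np.repeat([1, 2], 50), name="Class")

demo_model = RandomForestClassifier(n_estimators=10, random_state=0)
total, mean = cv_confusion_matrices(demo_model, demo_x, demo_y, n_repeats=2)
print(total)
print(mean)
```

The two returned arrays can be passed to sns.heatmap(..., annot=True) exactly as in the original function. Converting the inputs with x.to_numpy() and y.to_numpy() before calling the original function should have a similar effect, since NumPy arrays are indexed by position.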