Question

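I am using scikit-learn ensemble classifiers for classification. I have separate training and test datasets. When I run other machine learning algorithms on the same datasets, I get consistent accuracy. The inconsistency appears only with the ensemble classifiers, even though I have set random_state to 0.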

from sklearn.ensemble import BaggingClassifier
from sklearn.metrics import confusion_matrix

# train_arrays/train_labels and test_arrays/test_labels are the separate, pre-split datasets
bag_classifier = BaggingClassifier(n_estimators=10, random_state=0)
bag_classifier.fit(train_arrays, train_labels)
bag_predict = bag_classifier.predict(test_arrays)
bag_accuracy = bag_classifier.score(test_arrays, test_labels)
bag_cm = confusion_matrix(test_labels, bag_predict)
print("The Bagging Classifier accuracy is:", bag_accuracy)
print("The Confusion Matrix is")
print(bag_cm)

Answer 1

You will normally get different results for the same model because the train/test split is random each time the model is trained. You can reproduce the same results by passing a seed value to the train/test split.

from sklearn.model_selection import train_test_split
train, test = train_test_split(your_data, test_size=0.3, random_state=57)

Keep the same random_state value for every training run.
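
As a rough check of the advice above, the sketch below trains the same bagging model twice with both seeds fixed and compares the scores. The synthetic data from make_classification and the helper name run_once are assumptions made for illustration; they stand in for the asker's real arrays and are not part of the original question.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split


def run_once(split_seed=57, model_seed=0):
    """Train and score one bagging model with fixed seeds (hypothetical helper)."""
    # Synthetic data stands in for the real dataset (assumption for illustration).
    X, y = make_classification(n_samples=300, n_features=10, random_state=0)

    # Fixing random_state here makes the split identical on every call.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=split_seed
    )

    # Fixing random_state here makes the bootstrap sampling repeatable.
    clf = BaggingClassifier(n_estimators=10, random_state=model_seed)
    clf.fit(X_train, y_train)
    return clf.score(X_test, y_test)


first = run_once()
second = run_once()
print("Scores match:", first == second)
```

If the two scores still differ in a real project, something other than the split or the ensemble seed is probably varying between runs, for example the order in which the input data is loaded.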
