sklearn auc score - difference between metrics.roc_auc_score & model_selection.cross_val_score

Asked: 2018-03-28 21:31:24

Tags: python machine-learning scikit-learn auc

Please be gentle, I'm not familiar with sklearn. I'm computing customer churn, and with different ways of scoring roc_auc I get 3 different scores. Scores 1 and 3 are close, but there is a significant difference between those and score 2. I'd appreciate guidance on why this difference occurs and which score should be preferred. Many thanks!

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.metrics import roc_auc_score


param_grid = {'n_estimators': range(10, 510, 100)}
grid_search = GridSearchCV(estimator=RandomForestClassifier(criterion='gini', max_features='auto',
                    random_state=20), param_grid=param_grid, scoring='roc_auc', n_jobs=4, iid=False, cv=5, verbose=0)
grid_search.fit(self.dataset_train, self.churn_train)
score_roc_auc = np.mean(cross_val_score(grid_search, self.dataset_test, self.churn_test, cv=5, scoring='roc_auc'))
# ^^^ SCORE1 - 0.6395751751133528

pred = grid_search.predict(self.dataset_test)
score_roc_auc_2 = roc_auc_score(self.churn_test, pred)
# ^^^ SCORE2 - 0.5063261397640454

print("grid best score ", grid_search.best_score_)
# ^^^ SCORE3 - 0.6473102070034342

1 Answer:

Answer 0 (score: 0)

I believe this is answered by the link below, which points to the scoring done on the folds and smaller splits inside GridSearchCV:

Difference in ROC-AUC scores in sklearn RandomForestClassifier vs. auc methods
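A likely explanation for the low SCORE2 (this is my reading, not stated in the post): `roc_auc_score(y_test, pred)` is being given the hard 0/1 labels from `predict()`, while the `scoring='roc_auc'` scorer used for SCORE1 and SCORE3 ranks examples by the positive-class probability from `predict_proba()`. AUC measures ranking quality, so feeding it binary labels collapses the ROC curve to a single threshold point and typically deflates the score. A minimal sketch on synthetic data (all names and parameters here are illustrative, not from the original post):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Imbalanced synthetic data, loosely mimicking a churn problem.
X, y = make_classification(n_samples=2000, n_features=20,
                           weights=[0.8, 0.2], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)

# Hard labels: the ROC curve degenerates to one threshold point.
auc_labels = roc_auc_score(y_test, clf.predict(X_test))

# Positive-class probabilities: the full ranking is used,
# matching what scoring='roc_auc' does internally.
auc_proba = roc_auc_score(y_test, clf.predict_proba(X_test)[:, 1])

print(auc_labels, auc_proba)
```

On data like this the probability-based AUC is noticeably higher than the label-based one, which would account for SCORE2 sitting well below SCORE1 and SCORE3.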