After running GridSearchCV, why doesn't cross_val_score reflect the best classifier's performance metrics?

Date: 2019-06-12 14:18:50

Tags: python scikit-learn cross-validation grid-search weighted

The cross_val_score output does not reflect GridSearchCV's refit estimator. Please skip to the bottom to see my results.

I performed a nested cross-validation with grid search: first selecting the best hyperparameters for the model, then evaluating the model with cross_val_score (see the code below). For my grid search, I pass refit_score so the estimator is refit on that metric. For my cross_val_score, I evaluate multiple metrics.

def grid_search_nested_cv(model, refit_score='precision_score'):
    """
    Fits a GridSearchCV classifier using refit_score for optimization,
    prints classifier performance metrics, and performs both an inner
    and an outer cross-validation.
    """
    # Inner CV: used within GridSearchCV for hyperparameter tuning
    inner_cv = KFold(n_splits=5, shuffle=True, random_state=42)

    # Outer CV: used for model evaluation
    outer_cv = KFold(n_splits=5, shuffle=True, random_state=42)

    # Non-nested parameter search and scoring
    grid_search = GridSearchCV(model, param_grid, scoring=scorers, refit=refit_score,
                               cv=inner_cv, return_train_score=True, n_jobs=-1)
    grid_search.fit(X_df, y_df)  # fit once so best_params_ is available

    print('Best params for {}'.format(refit_score))
    print(grid_search.best_params_)

    # Nested CV: cross_val_score refits grid_search on each outer fold
    score_types = ['accuracy', 'f1_weighted', 'precision_weighted', 'recall_weighted']
    for scores_ in score_types:
        performance = cross_val_score(grid_search, X_df, y_df, scoring=scores_,
                                      cv=outer_cv, n_jobs=-1)
        print("{}: {}%".format(scores_, round(100 * performance.mean(), 2)))

    return grid_search
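For context, here is a minimal self-contained sketch of the same nested pattern on synthetic data (the param_grid values are toy stand-ins, not my real grid). Note that cross_val_score clones and refits the GridSearchCV estimator inside each outer fold, so the five outer scores come from five separate searches, not from the single best_params_ printed by the function above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, KFold, cross_val_score

# Synthetic stand-in for X_df / y_df
X, y = make_classification(n_samples=300, random_state=0)

inner_cv = KFold(n_splits=5, shuffle=True, random_state=42)
outer_cv = KFold(n_splits=5, shuffle=True, random_state=42)

gs = GridSearchCV(
    RandomForestClassifier(n_estimators=10, random_state=0),
    param_grid={'max_depth': [2, 5]},  # toy grid for illustration
    scoring='precision',
    cv=inner_cv,
)

# Each of the 5 outer-fold scores comes from a fresh clone of `gs`,
# tuned only on that fold's training portion.
scores = cross_val_score(gs, X, y, scoring='precision_weighted', cv=outer_cv)
print(scores.mean())
```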


# MODEL 1 :RANDOM FOREST (PRECISION WEIGHTED)
# Initialize Random Forest Classifier model
clf = RandomForestClassifier(n_jobs=-1)

# use a full grid over interested parameters
param_grid = {
    'min_samples_split': [...],
    'n_estimators': [...],
    'max_depth': [...],
    'max_features': [...],
}

grid_search_clf = grid_search_nested_cv(clf, refit_score='precision_score')


# MODEL 2: RANDOM FOREST (RECALL WEIGHTED)
# Initialize Random Forest Classifier model
clf = RandomForestClassifier(n_jobs=-1)

# use a full grid over interested parameters
param_grid = {
    'min_samples_split': [...],
    'n_estimators': [...],
    'max_depth': [...],
    'max_features': [...],
}

grid_search_clf = grid_search_nested_cv(clf, refit_score='recall_score')

Results:

MODEL 1: Precision weighted
Best params for precision_score
{'max_depth': 20, 'max_features': 0.3, 'min_samples_split': 15, 'n_estimators': 10}
Accuracy: 67.24%
F1: 63.58%
Precision: 61.02%
Recall: 70.38%


MODEL 2: RECALL weighted
Best params for recall_score
{'max_depth': 10, 'max_features': 0.5, 'min_samples_split': 18, 'n_estimators': 6}
Accuracy: 66.66%
F1: 63.39%
Precision: 63.25%
Recall: 61.37%

Model 1 (precision weighted) has a higher recall score than model 2 (recall weighted)

Model 2 has a higher precision score than model 1. 

Why is this happening? Shouldn't it be the other way around?
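For reference, a toy illustration (made-up labels, not my data) of how the *_weighted scores above are computed: the per-class metric is averaged with each class's support as the weight.

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score

y_true = [0, 0, 0, 1, 1]
y_pred = [0, 1, 0, 1, 0]

# average='weighted': per-class metric weighted by class support
p = precision_score(y_true, y_pred, average='weighted')
r = recall_score(y_true, y_pred, average='weighted')
acc = accuracy_score(y_true, y_pred)

# Support-weighted recall reduces algebraically to accuracy:
# sum_c (n_c/N) * (TP_c/n_c) = sum_c TP_c / N
print(p, r, acc)  # 0.6 0.6 0.6
```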

0 Answers:

No answers yet.