Can KFold and GridSearchCV be used together, or does that lead to overfitting?
I have tried using KFold and GridSearchCV both independently (one after the other) and combined (in a single call), but I cannot tell whether it makes any difference to my confusion matrix, since both approaches only return TP and FP. There is no difference between the confusion matrices. Tuning the parameters does not change anything either; it only shifts the TP and FP counts over to TN and FN with the same numbers assigned.
from sklearn import datasets, svm
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import KFold, GridSearchCV

# Base SVC estimator to tune; X_train_std, y_train and X_test come from an
# earlier split/standardisation step (456 training and 114 test instances).
clf_svc = svm.SVC()

kf = KFold(n_splits=10, shuffle=False, random_state=None)

parameter_grid = [
    {'C': [0.1, 1, 10, 50], 'kernel': ['linear'],
     'gamma': [0.0001, 0.001, 0.01, 0.1, 1],
     'class_weight': ['balanced', {1: 2}, {1: 3}, {1: 4}, {1: 5}]
     },
    {'C': [0.1, 1, 10, 50], 'kernel': ['rbf'],
     'gamma': [0.0001, 0.001, 0.01, 0.1, 1],
     'class_weight': ['balanced', {1: 2}, {1: 3}, {1: 4}, {1: 5}]
     }
]

# Grid search with the classifier and grid parameters, scored by accuracy (the default).
clf_stand_acc = GridSearchCV(clf_svc,
                             param_grid=parameter_grid,
                             cv=kf,
                             n_jobs=-1)
clf_stand_acc.fit(X_train_std, y_train)
y_predict_acc = clf_stand_acc.predict(X_test)

# Grid search with the classifier and grid parameters, scored by ROC AUC.
clf_stand_auc = GridSearchCV(clf_svc,
                             param_grid=parameter_grid,
                             cv=kf,
                             n_jobs=-1,
                             scoring='roc_auc')
clf_stand_auc.fit(X_train_std, y_train)
y_predict_auc = clf_stand_auc.predict(X_test)
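For reference, this is roughly what I mean by the two set-ups, independent versus combined (a minimal sketch; the fixed C/gamma/class_weight values in the stand-alone KFold run are only placeholders, and X_train_std/y_train are the same training arrays as above):

from sklearn import svm
from sklearn.model_selection import KFold, GridSearchCV, cross_val_score

kf = KFold(n_splits=10, shuffle=False)

# "Independently": KFold just drives a plain cross-validation of one fixed model.
fixed_clf = svm.SVC(C=1, kernel='rbf', gamma=0.01, class_weight='balanced')
cv_scores = cross_val_score(fixed_clf, X_train_std, y_train, cv=kf)

# "Combined": the same KFold object is passed to GridSearchCV, so every
# parameter combination is evaluated on the same 10 folds.
grid_clf = GridSearchCV(svm.SVC(),
                        param_grid={'C': [0.1, 1, 10, 50],
                                    'gamma': [0.0001, 0.001, 0.01, 0.1, 1]},
                        cv=kf,
                        n_jobs=-1)
grid_clf.fit(X_train_std, y_train)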
This is what I get:
Number of instances to train: 456    Number of instances to test: 114
SVC train accuracy: 0.63158
SVC test accuracy: 0.67544

             precision    recall  f1-score   support

          0       0.00      0.00      0.00        37
          1       0.68      1.00      0.81        77

avg / total       0.46      0.68      0.54       114

[[ 0 37]
 [ 0 77]]
This is what I expected:
Number of instances to train: 456    Number of instances to test: 114
SVC train accuracy: 0.79158
SVC test accuracy: 0.77544

             precision    recall  f1-score   support

          0       0.00      0.00      0.00        37
          1       0.68      1.00      0.89        77

avg / total       0.59      0.68      0.74       114

[[35  2]
 [ 1 76]]
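For completeness, the reports above are produced along these lines (a sketch; y_test is the held-out label vector that goes with X_test, which is not shown in the snippet above):

from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

# y_test: held-out labels matching X_test (from the earlier train/test split).
print('Number of instances to train:', len(y_train),
      'Number of instances to test:', len(y_test))
print('SVC train accuracy:', accuracy_score(y_train, clf_stand_acc.predict(X_train_std)))
print('SVC test accuracy:', accuracy_score(y_test, y_predict_acc))
print(classification_report(y_test, y_predict_acc))
print(confusion_matrix(y_test, y_predict_acc))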