Sklearn: Is there a way to define a specific score type for a pipeline?

Time: 2020-04-29 03:47:48

Tags: python python-3.x scikit-learn pipeline

I can do this:

from sklearn import linear_model, model_selection
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

model = linear_model.LogisticRegression(solver='lbfgs', max_iter=10000)
kfold = model_selection.KFold(n_splits=number_splits, shuffle=True, random_state=random_state)
scalar = StandardScaler()
pipeline = Pipeline([('transformer', scalar), ('estimator', model)])
results = model_selection.cross_validate(pipeline, X, y, cv=kfold, scoring=score_list, return_train_score=True)

where score_list can be something like ['accuracy', 'balanced_accuracy', 'precision', 'recall', 'f1'].

I can also do this:

kfold = model_selection.KFold(n_splits=number_splits,shuffle=True, random_state=random_state)
scalar = StandardScaler()
pipeline = Pipeline([('transformer', scalar), ('estimator', model)])
for i, (train, test) in enumerate(kfold.split(X, y)):
    pipeline.fit(self.X[train], self.y[train])
    pipeline.score(self.X[test], self.y[test])

However, I cannot change the pipeline's score type on the last line. How can I do that?

1 Answer:

Answer 0 (score: 3)

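The score method always uses accuracy for classification and the r2 score for regression. There is no parameter to change that. A pipeline's score simply delegates to its final estimator, and that estimator's score comes from ClassifierMixin or RegressorMixin.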
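A quick way to check this claim is to compare the pipeline's score with accuracy_score computed on its predictions. This is only a minimal sketch: the synthetic data from make_classification and the names default_score and manual_accuracy are assumptions for illustration, not part of the question.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic binary classification data, used only for illustration
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

pipeline = Pipeline([
    ("transformer", StandardScaler()),
    ("estimator", LogisticRegression(solver="lbfgs", max_iter=10000)),
])
pipeline.fit(X, y)

# score() on a classifier pipeline reports plain accuracy
default_score = pipeline.score(X, y)

# The same number, computed explicitly from the predictions
manual_accuracy = accuracy_score(y, pipeline.predict(X))

print(default_score, manual_accuracy)
```

The two printed values should agree, which is why any other metric has to be computed outside of score.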

Instead, when we need other scoring options, we have to import the metric from sklearn.metrics and apply it to the predictions, like below.

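from sklearn.metrics import balanced_accuracy_score

# score() does not accept a metric, so predict first and pass the labels to the metric
y_pred = pipeline.predict(self.X[test])
balanced_accuracy_score(self.y[test], y_pred)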
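Putting this together with the manual KFold loop from the question, one possible sketch looks like the following. It assumes synthetic binary data and plain X and y arrays instead of the self.X and self.y attributes, and it uses sklearn.metrics.get_scorer to turn the same metric names that cross_validate accepts into callables. The fold_scores dictionary and the last_balanced variable are hypothetical names for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import balanced_accuracy_score, get_scorer
from sklearn.model_selection import KFold
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic binary classification data, used only for illustration
X, y = make_classification(n_samples=300, n_features=6, random_state=0)

# Same metric names as the score_list in the question
score_list = ["accuracy", "balanced_accuracy", "precision", "recall", "f1"]
scorers = {name: get_scorer(name) for name in score_list}

pipeline = Pipeline([
    ("transformer", StandardScaler()),
    ("estimator", LogisticRegression(solver="lbfgs", max_iter=10000)),
])
kfold = KFold(n_splits=5, shuffle=True, random_state=0)

fold_scores = {name: [] for name in score_list}
for train, test in kfold.split(X, y):
    pipeline.fit(X[train], y[train])
    # Each scorer is called as scorer(estimator, X, y)
    for name, scorer in scorers.items():
        fold_scores[name].append(scorer(pipeline, X[test], y[test]))

# Cross-check on the last fold: predict, then call the metric function directly
y_pred = pipeline.predict(X[test])
last_balanced = balanced_accuracy_score(y[test], y_pred)

for name, values in fold_scores.items():
    print(name, values)
```

Calling the metric function on the predictions and calling the scorer on the fitted pipeline should give the same value for a given fold; the scorer form is convenient when several metrics are needed at once.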