Question

Problem

I am trying to use scikit-learn's LogisticRegressionCV with roc_auc_score as the scoring metric.

from sklearn.linear_model import LogisticRegressionCV
from sklearn.metrics import roc_auc_score

clf = LogisticRegressionCV(scoring=roc_auc_score)

But when I try to fit the model (clf.fit(X, y)), it throws an error.

 ValueError: average has to be one of (None, 'micro', 'macro', 'weighted', 'samples')

Cool. It's clear what's going on: roc_auc_score needs to be called with the average argument specified, per its documentation and the error above. So I tried that.

clf = LogisticRegressionCV(scoring=roc_auc_score(average='weighted'))

But it turns out that roc_auc_score can't be called with an optional argument alone, because this throws another error.

TypeError: roc_auc_score() takes at least 2 arguments (1 given)

Question

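Any thoughts on how I can use roc_auc_score as the scoring metric for LogisticRegressionCV in a way that lets me specify an argument for the scoring function?

I couldn't find an SO question on this issue, or any discussion of it in scikit-learn's GitHub repo, but surely someone has run into this before?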

Answer 1

You can use make_scorer, e.g.

from sklearn.linear_model import LogisticRegressionCV
from sklearn.metrics import roc_auc_score, make_scorer
from sklearn.datasets import make_classification

# some example data
X, y = make_classification()

# small wrapper that keeps only the probability column for P(y == 1)
def roc_auc_score_proba(y_true, proba):
    return roc_auc_score(y_true, proba[:, 1])

# define your scorer
auc = make_scorer(roc_auc_score_proba, needs_proba=True)

# define your classifier
clf = LogisticRegressionCV(scoring=auc)

# train
clf.fit(X, y)

# have a look at the scores
print(clf.scores_)
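
For readers on newer scikit-learn releases: to my knowledge the needs_proba argument was later deprecated in favour of response_method, so the snippet above may not run unchanged everywhere. A minimal sketch that sidesteps both, assuming only that scoring accepts any callable with the signature scorer(estimator, X, y). The name auc_from_proba and the Cs, cv, max_iter and random_state values are illustrative choices, not part of the original answer:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegressionCV
from sklearn.metrics import roc_auc_score

# synthetic binary data, for illustration only
X, y = make_classification(n_samples=200, random_state=0)

# hypothetical scorer: a plain callable with signature (estimator, X, y)
def auc_from_proba(estimator, X, y):
    # keep only the probability of the positive class
    proba = estimator.predict_proba(X)[:, 1]
    return roc_auc_score(y, proba, average="weighted")

# small grid and few folds to keep the example fast
clf = LogisticRegressionCV(Cs=5, cv=3, scoring=auc_from_proba, max_iter=1000)
clf.fit(X, y)

# the same callable can be reused to score the fitted model
train_auc = auc_from_proba(clf, X, y)
print(train_auc)
```

Because the callable receives the estimator itself, any keyword argument of roc_auc_score can be set inside the function body.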

Answer 2

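I found a way to solve this problem!

scikit-learn provides a make_scorer function in its metrics module that lets you build a scoring object from one of its native scoring functions, with arguments set to non-default values (see the scikit-learn documentation for more details on this function).

So I created a scoring object with the average argument specified.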

roc_auc_weighted = sk.metrics.make_scorer(sk.metrics.roc_auc_score, average='weighted')

Then I passed that object in the call to LogisticRegressionCV, and it ran without any issues!

clf = LogisticRegressionCV(scoring=roc_auc_weighted)
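
One caveat worth checking, sketched here under my own assumptions: by default make_scorer builds a scorer that passes the output of predict() (hard class labels) to the metric, so an AUC obtained this way is computed from labels rather than from probabilities or decision scores. The plain LogisticRegression model and the synthetic data are only for illustration:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import make_scorer, roc_auc_score

# synthetic binary data, for illustration only
X, y = make_classification(n_samples=300, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X, y)

# scorer built as in the answer: extra kwargs are forwarded to the metric
roc_auc_weighted = make_scorer(roc_auc_score, average="weighted")

auc_from_scorer = roc_auc_weighted(model, X, y)
# AUC computed by hand from hard labels: matches the scorer
auc_from_labels = roc_auc_score(y, model.predict(X), average="weighted")
# AUC computed from continuous decision scores: usually a different value
auc_from_scores = roc_auc_score(y, model.decision_function(X), average="weighted")

print(auc_from_scorer, auc_from_labels, auc_from_scores)
```

If a ranking-based AUC is what you want, a scorer that reads predict_proba or decision_function is probably the better fit.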

Answer 3

A bit late (4 years later), but today you can use:

clf = LogisticRegressionCV(scoring='roc_auc')

In addition, all the other scoring keys can be listed with:

from sklearn.metrics import SCORERS
print(SCORERS.keys())
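
As far as I know, SCORERS has been removed from recent scikit-learn releases in favour of get_scorer_names. A hedged, version-tolerant sketch that tries the newer function first and falls back to the older dictionary (the variable name names is my own):

```python
# try the newer helper first, fall back to the older SCORERS dict
try:
    from sklearn.metrics import get_scorer_names
    names = sorted(get_scorer_names())
except ImportError:
    from sklearn.metrics import SCORERS
    names = sorted(SCORERS.keys())

# every entry is a string that can be passed as scoring=...
print(names)
```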

How to pass an argument to a scoring function in scikit-learn's LogisticRegressionCV call
