Question

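In this simplified example, I've trained a learner with GridSearchCV. I would like to return the confusion matrix of the best learner when predicting on the full set X.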

Thanks

Answer 1

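You will first need to predict using the best estimator in your GridSearchCV. A common method to use is GridSearchCV.decision_function(), but for your example, decision_function returns continuous confidence scores from LogisticRegression rather than class labels, so its output cannot be passed to confusion_matrix. Instead, find the best estimator using lr_gs and predict the labels using that estimator.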

y_pred = lr_gs.best_estimator_.predict(X)

Finally, use sklearn's confusion_matrix on the true and predicted y

from sklearn.metrics import confusion_matrix
print(confusion_matrix(y, y_pred))
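
For anyone who wants to run these steps end to end, here is a minimal sketch. The question's own setup code is not reproduced here, so the synthetic data from make_classification, the pipeline contents, and the parameter grid are all assumptions made for illustration. Only lr_gs, best_estimator_, and confusion_matrix come from the answer itself.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

# Assumed toy data; substitute your own X and y.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# Assumed pipeline and grid, named after the variables used in this thread.
lr_pipeline = Pipeline([('clf', LogisticRegression())])
lr_parameters = {'clf__C': [0.1, 1.0, 10.0]}

lr_gs = GridSearchCV(lr_pipeline, lr_parameters, cv=5)
lr_gs.fit(X, y)

# Predict labels with the refitted best estimator, then tabulate them.
y_pred = lr_gs.best_estimator_.predict(X)
cm = confusion_matrix(y, y_pred)
print(cm)
```

Rows of the printed matrix correspond to true labels and columns to predicted labels. Keep in mind that this matrix is computed on the same data the best estimator was refitted on.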

Answer 2

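I found this question while searching for how to calculate the confusion matrix while fitting Scikit-learn's GridSearchCV. I was able to find a solution by defining a custom scoring function, although it's somewhat kludgy. I'm leaving this answer for anyone else who makes a similar search.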

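As mentioned by @MLgeek and @bugo99iot, the accepted answer by @Sudeep Juvekar isn't really satisfactory. It offers a literal answer to the original question as asked, but a machine learning practitioner is rarely interested in the confusion matrix of a fitted model on its own training data. It is usually more interesting to know how well a model generalizes to data it hasn't seen.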

To use a custom scoring function in GridSearchCV you will need to import the Scikit-learn helper function make_scorer.

from sklearn.metrics import make_scorer

The custom scoring function looks like this

def _count_score(y_true, y_pred, label1=0, label2=1):
    return sum((y == label1 and pred == label2)
                for y, pred in zip(y_true, y_pred))

For a given pair of labels (label1, label2), it counts the number of examples where the true value of y is label1 and the predicted value of y is label2.

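To begin, find all of the labels in the training data

all_labels = sorted(set(y))

The optional argument scoring of GridSearchCV can receive a dictionary mapping strings to scorers. make_scorer can take a scoring function along with bindings for some of its parameters and produce a scorer, which is a particular type of callable used for scoring in GridSearchCV, cross_val_score, etc. Let's build up this dictionary for each pair of labels.

scorer = {}
for label1 in all_labels:
    for label2 in all_labels:
        count_score = make_scorer(_count_score, label1=label1,
                                  label2=label2)
        scorer['count_%s_%s' % (label1, label2)] = count_score

You'll also want to add any additional scoring functions you are interested in. To avoid getting into the subtleties of scoring for multi-class classification, let's add a simple accuracy score.

# import placed here for the sake of demonstration.
# Should be imported alongside make_scorer above
from sklearn.metrics import accuracy_score

scorer['accuracy'] = make_scorer(accuracy_score)

We can now fit GridSearchCV

num_splits = 5
lr_gs = GridSearchCV(lr_pipeline, lr_parameters, n_jobs=-1,
                     scoring=scorer, refit='accuracy',
                     cv=num_splits)
lr_gs.fit(X, y)

refit='accuracy' tells GridSearchCV to judge by the best accuracy score when deciding which parameters to use for refitting. When you pass a dictionary of multiple scorers to scoring, GridSearchCV cannot pick a refit metric on its own, so refit has to name one of the scorer keys if you want the model refitted on all of the training data. We've set the number of splits explicitly because we will need to know it later.

Now, for each of the training folds used in cross-validation, what we've essentially done is calculate the confusion matrix on the respective test fold. The test folds do not overlap and together cover the entire data set, so we have made a prediction for each data point in X in such a way that the prediction for each point does not depend on that point's target label.

We can add up the confusion matrices associated with the test folds to get something useful that tells us how well the model generalizes. It can also be interesting to look at the confusion matrices for the test folds separately and do things like calculate variances.

We're not done yet, though. We still need to pull out the confusion matrix for the best estimator. In this example, the cross-validation results are stored in the dictionary lr_gs.cv_results_ (note the trailing underscore). First, let's get the index in the results corresponding to the best set of parameters

best_index = lr_gs.cv_results_['rank_test_accuracy'].argmin()

If you are using a different metric to decide on the best parameters, replace "accuracy" with the key you used for the associated scorer in the scoring dictionary passed to GridSearchCV.

In my own application I chose to store the confusion matrix as a nested dictionary.

from collections import defaultdict

confusion = defaultdict(lambda: defaultdict(int))
for label1 in all_labels:
    for label2 in all_labels:
        for i in range(num_splits):
            key = 'split%s_test_count_%s_%s' % (i, label1, label2)
            val = int(lr_gs.cv_results_[key][best_index])
            confusion[label1][label2] += val
confusion = {key: dict(value) for key, value in confusion.items()}

There's some stuff to unpack here. defaultdict(lambda: defaultdict(int)) constructs a nested defaultdict: a defaultdict of defaultdict of int, which is why the snippet imports defaultdict from collections. The last line of the snippet turns confusion into a regular dict of dict of int. Never leave defaultdicts lying around when they are no longer needed.

You will likely want to store your confusion matrix in some other way. The key fact is that the confusion matrix entry for the pair of labels label1, label2 for test fold i is stored in

lr_gs.cv_results_['split%s_test_count_%s_%s' % (i, label1, label2)][best_index]

See here for an example of this confusion matrix calculation used in practice. I think relying on the specific format of the keys in the cv_results_ dictionary is a bit of a code smell, but this does work, at least as of the day of this post.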
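
Putting the pieces of this answer together, the following is a minimal end-to-end sketch. The toy data from make_classification, the pipeline, and the parameter grid are assumptions for illustration and do not come from the original post. The sketch reads results from cv_results_ (the attribute name scikit-learn uses, with a trailing underscore), takes the best row with argmin on the rank array, and leaves out n_jobs for simplicity.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, make_scorer
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline


def _count_score(y_true, y_pred, label1=0, label2=1):
    # Samples whose true label is label1 and predicted label is label2.
    return sum((y == label1 and pred == label2)
               for y, pred in zip(y_true, y_pred))


# Assumed toy data, pipeline, and grid; substitute your own.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
lr_pipeline = Pipeline([('clf', LogisticRegression())])
lr_parameters = {'clf__C': [0.1, 1.0, 10.0]}

# Plain Python ints keep the scorer names and dictionary keys readable.
all_labels = sorted(set(y.tolist()))

# One counting scorer per (true, predicted) label pair, plus accuracy.
scorer = {'accuracy': make_scorer(accuracy_score)}
for label1 in all_labels:
    for label2 in all_labels:
        scorer['count_%s_%s' % (label1, label2)] = make_scorer(
            _count_score, label1=label1, label2=label2)

num_splits = 5
lr_gs = GridSearchCV(lr_pipeline, lr_parameters, scoring=scorer,
                     refit='accuracy', cv=num_splits)
lr_gs.fit(X, y)

# Row of cv_results_ holding the parameters ranked first on accuracy.
best_index = lr_gs.cv_results_['rank_test_accuracy'].argmin()

# Sum the per-fold counts into one out-of-fold confusion matrix.
confusion = {
    label1: {
        label2: sum(
            int(lr_gs.cv_results_['split%s_test_count_%s_%s'
                                  % (i, label1, label2)][best_index])
            for i in range(num_splits))
        for label2 in all_labels}
    for label1 in all_labels}
print(confusion)
```

Because the test folds partition the data, the entries of confusion should add up to the number of samples, which makes a quick check that the keys were read correctly.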

Sci-kit: What's the easiest way to get the confusion matrix of an estimator when using GridSearchCV?
