mean_validation_score raises AttributeError

Time: 2019-08-12 15:29:38

Tags: python scikit-learn jupyter-notebook

I'm currently working through some exercises on kernel density estimation, and I'm trying to run this code:

from sklearn.datasets import load_digits
from sklearn.model_selection import GridSearchCV

digits = load_digits()

bandwidths = 10 ** np.linspace(0, 2, 100)
grid = GridSearchCV(KDEClassifier(), {'bandwidth': bandwidths}, cv=3)
grid.fit(digits.data, digits.target)

scores = [val.mean_validation_score for val in grid.cv_results_]

But as the title says, I get

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-29-15a5f685e6d6> in <module>
      8 grid.fit(digits.data, digits.target)
      9 
---> 10 scores = [val.mean_validation_score for val in grid.cv_results_] 

<ipython-input-29-15a5f685e6d6> in <listcomp>(.0)
      8 grid.fit(digits.data, digits.target)
      9 
---> 10 scores = [val.mean_validation_score for val in grid.cv_results_] 
AttributeError: 'str' object has no attribute 'mean_validation_score'

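I don't understand why mean_validation_score fails. The code is taken straight from a book, with a few changes because I'm running the latest scikit-learn package. This is the original snippet: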

from sklearn.datasets import load_digits
from sklearn.grid_search import GridSearchCV

digits = load_digits()

bandwidths = 10 ** np.linspace(0, 2, 100)
grid = GridSearchCV(KDEClassifier(), {'bandwidth': bandwidths})
grid.fit(digits.data, digits.target)

scores = [val.mean_validation_score for val in grid.grid_scores_]

Edit:

I forgot to add how KDEClassifier is defined:

import numpy as np
from sklearn.base import BaseEstimator, ClassifierMixin
from sklearn.neighbors import KernelDensity


class KDEClassifier(BaseEstimator, ClassifierMixin):
    """Bayesian generative classification based on KDE

    Parameters
    ----------
    bandwidth : float
        the kernel bandwidth within each class
    kernel : str
        the kernel name, passed to KernelDensity
    """
    def __init__(self, bandwidth=1.0, kernel='gaussian'):
        self.bandwidth = bandwidth
        self.kernel = kernel

    def fit(self, X, y):
        self.classes_ = np.sort(np.unique(y))
        training_sets = [X[y == yi] for yi in self.classes_]
        self.models_ = [KernelDensity(bandwidth=self.bandwidth,
                                        kernel=self.kernel).fit(Xi)
                        for Xi in training_sets]
        self.logpriors_ = [np.log(Xi.shape[0] / X.shape[0])
                            for Xi in training_sets]
        return self

    def predict_proba(self, X):
        logprobs = np.array([model.score_samples(X)
                                for model in self.models_]).T
        result = np.exp(logprobs + self.logpriors_)
        return result / result.sum(1, keepdims=True)

    def predict(self, X):
        return self.classes_[np.argmax(self.predict_proba(X), 1)]

2 answers:

Answer 0: (score: 0)

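The documentation for GridSearchCV states that the cv_results_ attribute is a dictionary. Iterating over a Python dictionary yields its keys, which here are strings, and that is why you get the 'str' object error.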
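To make this concrete, here is a minimal sketch that reproduces the error without scikit-learn. The dictionary below is only a stand-in for grid.cv_results_, and its values are invented for illustration:

```python
# Stand-in for grid.cv_results_; the numbers are made up for illustration.
cv_results = {
    "params": [{"bandwidth": 1.0}, {"bandwidth": 10.0}],
    "mean_test_score": [0.91, 0.95],
}

# Iterating over a dict yields its keys, which are plain strings.
keys = [key for key in cv_results]

# So attribute access on each item fails the same way as in the question.
try:
    scores = [val.mean_validation_score for val in cv_results]
except AttributeError as exc:
    message = str(exc)

print(keys)
print(message)
```

The loop variable val is a key such as "params", not a per-candidate result object, so it has no score attributes at all.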

My suggestion is to specify the scoring you want in the GridSearchCV constructor and then look at the cv_results_ dictionary.
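A rough sketch of that suggestion follows. It assumes scoring="accuracy" is the metric you want, and it swaps in KNeighborsClassifier with a tiny n_neighbors grid purely so the example runs quickly; with the KDEClassifier from the question you would pass {'bandwidth': bandwidths} instead:

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier

digits = load_digits()

# KNeighborsClassifier is only a stand-in estimator for this sketch;
# the accuracy metric is likewise an assumption, not something the question specifies.
grid = GridSearchCV(
    KNeighborsClassifier(),
    {"n_neighbors": [1, 3, 5]},
    scoring="accuracy",
    cv=3,
)
grid.fit(digits.data, digits.target)

# cv_results_ is a dict; list its keys to see what is available.
print(sorted(grid.cv_results_.keys()))

# One mean cross-validated score per parameter candidate.
scores = grid.cv_results_["mean_test_score"]
print(scores)
```

Printing the keys first is a quick way to find the entry you need, for example mean_test_score, std_test_score, or params.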

Hope that helps.

Answer 1: (score: 0)

It's simple. I ran into the same problem; just replace this line:

scores = [val.mean_test_score for val in grid.cv_results_]

with

scores = grid.cv_results_.get('mean_test_score').tolist()

because 'mean_test_score' is a documented key of grid.cv_results_, which is a dict rather than a list of result objects.
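If you would rather keep the attribute-style access from the book, one option is to rebuild per-candidate records from the dictionary. This is only a sketch: CandidateScore and to_records are hypothetical names, not part of scikit-learn, and the dictionary is a stand-in with invented values.

```python
from collections import namedtuple

# Hypothetical record type; the field names echo the old grid_scores_ entries.
CandidateScore = namedtuple("CandidateScore", ["parameters", "mean_validation_score"])


def to_records(cv_results):
    """Pair each parameter setting with its mean test score."""
    return [
        CandidateScore(params, float(mean))
        for params, mean in zip(cv_results["params"], cv_results["mean_test_score"])
    ]


# Stand-in for grid.cv_results_; the numbers are made up for illustration.
cv_results = {
    "params": [{"bandwidth": 1.0}, {"bandwidth": 10.0}],
    "mean_test_score": [0.91, 0.95],
}

records = to_records(cv_results)
scores = [val.mean_validation_score for val in records]
print(scores)
```

With a fitted search you would call to_records(grid.cv_results_) in the same way.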