I'm currently working through some exercises on kernel density estimation, and I'm trying to run this code:
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import GridSearchCV
digits = load_digits()
bandwidths = 10 ** np.linspace(0, 2, 100)
grid = GridSearchCV(KDEClassifier(), {'bandwidth': bandwidths}, cv=3)
grid.fit(digits.data, digits.target)
scores = [val.mean_validation_score for val in grid.cv_results_]
But, as the title says, I get:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-29-15a5f685e6d6> in <module>
8 grid.fit(digits.data, digits.target)
9
---> 10 scores = [val.mean_validation_score for val in grid.cv_results_]
<ipython-input-29-15a5f685e6d6> in <listcomp>(.0)
8 grid.fit(digits.data, digits.target)
9
---> 10 scores = [val.mean_validation_score for val in grid.cv_results_]
AttributeError: 'str' object has no attribute 'mean_validation_score'
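I don't understand why mean_validation_score is the problem. The code is taken straight from a book, with a few changes because I'm running the latest scikit-learn package. This is the original snippet: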
from sklearn.datasets import load_digits
from sklearn.grid_search import GridSearchCV
digits = load_digits()
bandwidths = 10 ** np.linspace(0, 2, 100)
grid = GridSearchCV(KDEClassifier(), {'bandwidth': bandwidths})
grid.fit(digits.data, digits.target)
scores = [val.mean_validation_score for val in grid.grid_scores_]
Edit:
I forgot to add how KDEClassifier is defined:
import numpy as np
from sklearn.base import BaseEstimator, ClassifierMixin
from sklearn.neighbors import KernelDensity


class KDEClassifier(BaseEstimator, ClassifierMixin):
    """Bayesian generative classification based on KDE

    Parameters
    ----------
    bandwidth : float
        the kernel bandwidth within each class
    kernel : str
        the kernel name, passed to KernelDensity
    """
    def __init__(self, bandwidth=1.0, kernel='gaussian'):
        self.bandwidth = bandwidth
        self.kernel = kernel

    def fit(self, X, y):
        # Fit one KernelDensity model per class and store the log class priors.
        self.classes_ = np.sort(np.unique(y))
        training_sets = [X[y == yi] for yi in self.classes_]
        self.models_ = [KernelDensity(bandwidth=self.bandwidth,
                                      kernel=self.kernel).fit(Xi)
                        for Xi in training_sets]
        self.logpriors_ = [np.log(Xi.shape[0] / X.shape[0])
                           for Xi in training_sets]
        return self

    def predict_proba(self, X):
        # Per-class log densities, shape (n_samples, n_classes).
        logprobs = np.array([model.score_samples(X)
                             for model in self.models_]).T
        result = np.exp(logprobs + self.logpriors_)
        return result / result.sum(1, keepdims=True)

    def predict(self, X):
        return self.classes_[np.argmax(self.predict_proba(X), 1)]
Answer 0 (score: 0)
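The documentation for GridSearchCV states that the cv_results_ attribute is a dictionary. Iterating over a Python dictionary yields its keys, which here are strings, so each val in your list comprehension is a str, and a str has no mean_validation_score attribute.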
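To make the failure concrete, here is a minimal sketch that needs no scikit-learn at all. The dictionary is a hypothetical stand-in with the same shape as cv_results_, and the scores in it are made up for illustration.

```python
# Hypothetical stand-in with the same shape as GridSearchCV.cv_results_:
# a dict mapping column names to per-candidate values (scores are made up).
cv_results = {
    "params": [{"bandwidth": 1.0}, {"bandwidth": 10.0}],
    "mean_test_score": [0.95, 0.90],
}

# Iterating over a dict yields its keys, which are plain strings.
keys = [val for val in cv_results]
print(keys)

# Attribute access on a str key is what raises the AttributeError.
error_message = None
try:
    [val.mean_validation_score for val in cv_results]
except AttributeError as exc:
    error_message = str(exc)
    print(error_message)

# Indexing by key instead gives one score per parameter candidate.
scores = cv_results["mean_test_score"]
print(scores)
```

The same reasoning applies to the real cv_results_: look values up by key rather than iterating over the dictionary.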
My suggestion is to specify the scoring you want to use in the GridSearchCV constructor, and then take a look at the cv_results_ dictionary.
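As a rough sketch of that suggestion, the snippet below passes scoring="accuracy" and then reads the per-candidate scores out of cv_results_. Note that KNeighborsClassifier and the n_neighbors grid are stand-ins picked only to keep the example short and self-contained; they are not the KDEClassifier or the bandwidth grid from the question.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier

digits = load_digits()

# KNeighborsClassifier is a stand-in estimator chosen only to keep the
# sketch short; it is not the KDEClassifier from the question.
param_grid = {"n_neighbors": [1, 3, 5]}
grid = GridSearchCV(KNeighborsClassifier(), param_grid,
                    scoring="accuracy", cv=3)
grid.fit(digits.data, digits.target)

# cv_results_ is a dict of arrays with one entry per parameter combination.
mean_scores = grid.cv_results_["mean_test_score"]
for params, score in zip(grid.cv_results_["params"], mean_scores):
    print(params, round(float(score), 3))

print("best:", grid.best_params_)
```

With the question's estimator, the same pattern should apply by swapping in KDEClassifier() and {'bandwidth': bandwidths}.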
Hope that helps.
Answer 1 (score: 0)
It's simple. I ran into the same problem; just replace this line:
scores = [val.mean_test_score for val in grid.cv_results_]
with
scores = grid.cv_results_.get('mean_test_score').tolist()
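because 'mean_test_score' is the documented key for these scores, and grid.cv_results_ is a dict whose values are NumPy arrays (hence the .tolist() call).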
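Once the scores are a plain list, they can be lined up with the bandwidths that were searched. The sketch below uses a hypothetical stand-in for grid.cv_results_ with made-up scores and a shortened five-point bandwidth grid, so it only illustrates the lookup, not real results.

```python
import numpy as np

# Shortened version of the question's bandwidth grid (5 points, not 100).
bandwidths = 10 ** np.linspace(0, 2, 5)

# Hypothetical stand-in for grid.cv_results_; the scores are made up.
cv_results = {
    "mean_test_score": np.array([0.90, 0.94, 0.97, 0.93, 0.85]),
}

# The values are NumPy arrays, so .tolist() gives a plain Python list.
scores = cv_results.get("mean_test_score").tolist()

# Scores come back in the same order as the parameter candidates,
# so the index of the best score also indexes the bandwidth grid.
best_index = int(np.argmax(scores))
best_bandwidth = float(bandwidths[best_index])
print(best_index, best_bandwidth)
```

In a real run, grid.best_params_ reports the winning bandwidth directly, so the manual argmax is mainly useful when plotting scores against bandwidths.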