I keep getting an AttributeError in RandomizedSearchCV

Asked: 2018-06-25 07:15:09

Tags: machine-learning scikit-learn jupyter-notebook data-science sklearn-pandas

x_tu = data_cls_tu.iloc[:,1:].values
y_tu = data_cls_tu.iloc[:,0].values

classifier = DecisionTreeClassifier()
parameters = [{"max_depth": [3,None],
               "min_samples_leaf": np.random.randint(1,9),
               "criterion": ["gini","entropy"]}]
randomcv = RandomizedSearchCV(estimator=classifier, param_distributions=parameters,
                              scoring='accuracy', cv=10, n_jobs=-1,
                              random_state=0)
randomcv.fit(x_tu, y_tu)



---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-17-fa8376cb54b8> in <module>()
     11                               scoring='accuracy', cv=10, n_jobs=-1,
     12                               random_state=0)
---> 13 randomcv.fit(x_tu, y_tu)

~\Anaconda3\lib\site-packages\sklearn\model_selection\_search.py in fit(self, X, y, groups, **fit_params)
    616         n_splits = cv.get_n_splits(X, y, groups)
    617         # Regenerate parameter iterable for each fit
--> 618         candidate_params = list(self._get_param_iterator())
    619         n_candidates = len(candidate_params)
    620         if self.verbose > 0:

~\Anaconda3\lib\site-packages\sklearn\model_selection\_search.py in __iter__(self)
    236         # in this case we want to sample without replacement
    237         all_lists = np.all([not hasattr(v, "rvs")
--> 238                             for v in self.param_distributions.values()])
    239         rnd = check_random_state(self.random_state)
    240 

AttributeError: 'list' object has no attribute 'values'

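Hi, I keep getting an error from the fit method of RandomizedSearchCV.

The same parameters work when I pass them to GridSearchCV, but GridSearchCV takes 5 hours to finish.

x_tu and y_tu are both of type numpy.ndarray.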

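1 Answer:

Answer 0: (score: 4)

param_distributions must be a dict object (see the documentation), but you are passing a list that contains a single dict. The traceback shows the consequence: the search calls self.param_distributions.values(), and a list has no values method. Remove the outer square brackets and it should work.

It should look like this: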
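To see where the message comes from without involving scikit-learn at all, here is a minimal sketch that calls .values() on a list wrapping a dict, which is the same operation the traceback shows failing. The variable names are illustrative only and do not come from the question.

```python
# A list that wraps a single dict, shaped like the `parameters` in the question.
params_as_list = [{"max_depth": [3, None], "criterion": ["gini", "entropy"]}]

try:
    # The failing line in the traceback does the equivalent of this call.
    params_as_list.values()
except AttributeError as exc:
    message = str(exc)

print(message)

# Taking the dict out of the list gives an object that does have .values().
params_as_dict = params_as_list[0]
n_params = len(list(params_as_dict.values()))
```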

parameters = {"max_depth": [3, None],
              "min_samples_leaf": [np.random.randint(1, 9)],
              "criterion": ["gini", "entropy"]}
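One thing worth noting about the snippet above: np.random.randint(1, 9) is evaluated once and returns a single integer, so the search only ever tries that one value for min_samples_leaf. With only lists, there are just four combinations in total, fewer than the default n_iter=10; as far as I know, scikit-learn complains about that with a warning or an error depending on the version. If the intent was to sample a fresh value for each candidate, one option is to pass a distribution object such as scipy.stats.randint instead. The following is a rough sketch under that assumption; the synthetic data from make_classification stands in for x_tu and y_tu, which are not shown in the question, and cv=5 and n_iter=10 are arbitrary choices.

```python
from scipy.stats import randint
from sklearn.datasets import make_classification
from sklearn.model_selection import RandomizedSearchCV
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for x_tu / y_tu (assumption: the real data is not available).
X, y = make_classification(n_samples=200, n_features=8, random_state=0)

# A plain dict, not a list wrapping a dict.
param_distributions = {
    "max_depth": [3, None],
    # randint(1, 9) is a distribution over the integers 1..8;
    # the search draws a new value from it for every candidate.
    "min_samples_leaf": randint(1, 9),
    "criterion": ["gini", "entropy"],
}

search = RandomizedSearchCV(
    estimator=DecisionTreeClassifier(random_state=0),
    param_distributions=param_distributions,
    n_iter=10,          # number of sampled candidates
    scoring="accuracy",
    cv=5,
    random_state=0,
)
search.fit(X, y)
print(search.best_params_)
```

Because one of the entries is a distribution rather than a list, the search samples with replacement and is no longer limited to four combinations.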