Question

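I am using scikit-learn to run recursive feature elimination with cross-validation (RFECV) on a tree-based method inside a grid search (GridSearchCV). To do this, I am using the current development version on GitHub (0.17), which allows RFECV to use the feature importances from tree methods to select which features to discard.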

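To be clear, this means:

Loop over the hyperparameters of the current tree method
For each set of parameters, perform recursive feature elimination to find the optimal number of features
Report the 'score' (e.g. accuracy)
Determine which set of parameters produces the best score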

This code currently works fine, but I am getting a deprecation warning about the use of estimator_params. Here is the current code:

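from sklearn.tree import DecisionTreeClassifier
from sklearn.feature_selection import RFECV
# GridSearchCV lives in sklearn.grid_search in 0.17 (sklearn.model_selection from 0.18)
from sklearn.grid_search import GridSearchCV

# set up list of parameter dictionaries (better way to do this?)
depth = [1, 5, None]
weight = ['balanced', None]
param_grid = []

for d in depth:
    for w in weight:
        param_grid.append(dict(max_depth=d,
                               class_weight=w))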

# specify the classifier
estimator = DecisionTreeClassifier(random_state=0, 
                                   max_depth=None, 
                                   class_weight='balanced')

# specify the feature selection method
selector = RFECV(estimator,
                 step=1, 
                 cv=3, 
                 scoring='accuracy')

# set up the parameter search
clf = GridSearchCV(selector, 
                   {'estimator_params': param_grid}, 
                   cv=3)

clf.fit(X_train, y_train)

clf.best_estimator_.estimator_

Here is the full deprecation warning:

home/csw34/git/scikit-learn/sklearn/feature_selection/rfe.py:154: DeprecationWarning:

The parameter 'estimator_params' is deprecated as of version 0.16 and will be removed in 0.18. The parameter is no longer necessary because the value is set via the estimator initialisation or set_params method.

How can I get the same result without using estimator_params in GridSearchCV to pass the parameters through RFECV to the estimator?

Answer 1

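This solves your problem. Address the tree's parameters through RFECV with the estimator__ prefix instead of estimator_params: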

params = {'estimator__max_depth': [1, 5, None],
          'estimator__class_weight': ['balanced', None]}
estimator = DecisionTreeClassifier()
selector = RFECV(estimator, step=1, cv=3, scoring='accuracy')
clf = GridSearchCV(selector, params, cv=3)
clf.fit(X_train, y_train)
clf.best_estimator_.estimator_
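
For a version that runs end to end, here is a minimal sketch. It assumes scikit-learn 0.18 or later, where GridSearchCV is imported from sklearn.model_selection, and it substitutes a synthetic dataset from make_classification for X_train and y_train, which the question does not show.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFECV
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the question's X_train / y_train (an assumption).
X_train, y_train = make_classification(n_samples=200, n_features=10,
                                       n_informative=4, random_state=0)

# Keys use the <component>__<parameter> form: "estimator" is the RFECV
# constructor argument that holds the tree.
param_grid = {'estimator__max_depth': [1, 5, None],
              'estimator__class_weight': ['balanced', None]}

selector = RFECV(DecisionTreeClassifier(random_state=0),
                 step=1, cv=3, scoring='accuracy')
clf = GridSearchCV(selector, param_grid, cv=3)
clf.fit(X_train, y_train)

print(clf.best_params_)                 # winning tree hyperparameters
print(clf.best_estimator_.n_features_)  # number of features kept by the refit RFECV
```

Each candidate in the grid triggers a full RFECV run, so the cost grows with the number of parameter combinations times the number of elimination steps.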

To see which parameter names are available, use:

print(selector.get_params())
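
As a rough illustration of what that call reveals, the sketch below filters the output down to the nested tree parameters and then sets two of them with set_params, the method the deprecation warning points to. The variable name nested is only for illustration.

```python
from sklearn.feature_selection import RFECV
from sklearn.tree import DecisionTreeClassifier

selector = RFECV(DecisionTreeClassifier(), step=1, cv=3, scoring='accuracy')

# Parameters of the wrapped tree show up with an "estimator__" prefix.
nested = sorted(k for k in selector.get_params() if k.startswith('estimator__'))
print(nested)

# set_params accepts the same prefixed names and forwards them to the tree.
selector.set_params(estimator__max_depth=5, estimator__class_weight='balanced')
print(selector.estimator.max_depth)
print(selector.estimator.class_weight)
```

Any name printed in the filtered list can be used as a key in the GridSearchCV parameter grid.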
