Recursive feature elimination & grid search using scikit-learn: DeprecationWarning

Time: 2016-03-14 20:55:34

Tags: scikit-learn feature-selection deprecation-warning grid-search

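I'm building a grid search over multiple classifiers and want to use recursive feature elimination with cross-validation. I started from the code provided in "Recursive feature elimination and grid search using scikit-learn". Here is my working code: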

param_grid = [{'C': 0.001}, {'C': 0.01}, {'C': .1}, {'C': 1.0}, {'C': 10.0},
              {'C': 100.0}, {'fit_intercept': True}, {'fit_intercept': False},
              {'penalty': 'l1'}, {'penalty': 'l2'}]

estimator = LogisticRegression()
selector = RFECV(estimator, step=1, cv=5, scoring="roc_auc")
clf = grid_search.GridSearchCV(selector, {"estimator_params": param_grid},
                               cv=5, n_jobs=-1)
clf.fit(X,y)
print clf.best_estimator_.estimator_
print clf.best_estimator_.ranking_
print clf.best_estimator_.score(X, y)

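I'm getting a DeprecationWarning because it appears the "estimator_params" parameter is being removed in 0.18; I'm trying to work out the correct syntax to use in line 4 (the GridSearchCV call).

Trying...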

param_grid = [{'C': 0.001}, {'C': 0.01}, {'C': .1}, {'C': 1.0}, {'C': 10.0},
              {'C': 100.0}, {'fit_intercept': True}, {'fit_intercept': False},
              {'fit_intercept': 'l1'}, {'fit_intercept': 'l2'}]
clf = grid_search.GridSearchCV(selector, param_grid,
                               cv=5, n_jobs=-1)

returns ValueError: Parameter values should be a list. And...

param_grid = {"penalty": ["l1", "l2"],
              "C": [.001, .01, .1, 1, 10, 100],
              "fit_intercept": [True, False]}
clf = grid_search.GridSearchCV(selector, param_grid,
                               cv=5, n_jobs=-1)

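returns ValueError: Invalid parameter penalty for estimator RFECV. Check the list of available parameters with `estimator.get_params().keys()`. Checking the keys shows all three of "C", "fit_intercept" and "penalty" as parameter keys. Trying...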

param_grid = {"estimator__C": [.001,.01,.1,1,10,100],
              "estimator__fit_intercept": [True, False],
              "estimator__penalty": ["l1","l2"]}
clf = grid_search.GridSearchCV(selector, param_grid,
                               cv=5, n_jobs=-1)

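never finishes executing, so I'm guessing that this type of parameter assignment isn't supported.

For now I've set the warnings to be ignored, but I'd like to update the code to the appropriate syntax for 0.18. Any help would be greatly appreciated!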

1 Answer:

Answer 0 (score: 0)

This was answered in a question posted earlier on SO: https://stackoverflow.com/a/35560648/5336341. Thanks to Paulo Alves for the answer.

Relevant code:

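# GridSearchCV lives in sklearn.model_selection from 0.18 on (sklearn.grid_search before).
from sklearn.feature_selection import RFECV
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

# "estimator__<name>" routes each value to the estimator wrapped by RFECV.
params = {'estimator__max_depth': [1, 5, None],
          'estimator__class_weight': ['balanced', None]}
estimator = DecisionTreeClassifier()
selector = RFECV(estimator, step=1, cv=3, scoring='accuracy')
clf = GridSearchCV(selector, params, cv=3)
clf.fit(X_train, y_train)  # X_train, y_train: your own training data
clf.best_estimator_.estimator_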
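For the LogisticRegression setup in the question, the same prefix convention might look like the sketch below. Several things here are assumptions rather than part of the original post: the data is synthetic (make_classification), the grid and the fold counts are shrunk so the search finishes quickly, and solver="liblinear" is set explicitly because it accepts both the "l1" and "l2" penalties.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFECV
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in for the asker's X, y (an assumption).
X, y = make_classification(
    n_samples=120, n_features=6, n_informative=3, random_state=0
)

# Each key is "estimator__<name>", so GridSearchCV sets the value on the
# LogisticRegression wrapped inside RFECV rather than on RFECV itself.
param_grid = {
    "estimator__C": [0.1, 1.0],
    "estimator__fit_intercept": [True, False],
    "estimator__penalty": ["l1", "l2"],
}

# liblinear is chosen here only because it supports both penalties.
estimator = LogisticRegression(solver="liblinear")
selector = RFECV(estimator, step=1, cv=3, scoring="roc_auc")
clf = GridSearchCV(selector, param_grid, cv=3)
clf.fit(X, y)

print(clf.best_params_)               # chosen estimator__* values
print(clf.best_estimator_.ranking_)   # rank 1 marks a selected feature
print(clf.best_estimator_.score(X, y))
```

Every candidate in the grid triggers a full RFECV run on each outer fold, so the cost grows quickly with the number of features and grid values. A search like this can take a long time on a large grid even though the syntax itself is accepted.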

To see which parameter names are available, use:

print(selector.get_params())
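As a small illustration of what that call exposes, the sketch below filters the keys down to those that belong to the wrapped estimator. It reuses the LogisticRegression selector from the question; the variable names `names` and `nested` are only for illustration.

```python
from sklearn.feature_selection import RFECV
from sklearn.linear_model import LogisticRegression

selector = RFECV(LogisticRegression(), step=1, cv=5, scoring="roc_auc")

# All parameter names that GridSearchCV can set on the selector.
names = sorted(selector.get_params().keys())

# Parameters of the wrapped LogisticRegression carry the "estimator__" prefix.
nested = [name for name in names if name.startswith("estimator__")]
print(nested)

# The bare name is not a parameter of RFECV; the prefixed one is.
print("penalty" in names)
print("estimator__penalty" in names)
```

This is consistent with the "Invalid parameter penalty for estimator RFECV" error above: the grid keys are matched against the selector's own parameter names, where "penalty" only appears as "estimator__penalty".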