Getting feature importances with AdaBoost in sklearn

Time: 2014-01-26 13:14:12

Tags: python classification scikit-learn

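I am using the Python library sklearn. I am using the AdaBoost classifier and want to determine which features are most important for the classification. Here is my code: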

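import numpy as np
from sklearn.ensemble import AdaBoostClassifier
from sklearn.feature_selection import RFECV

# total_data and target are my feature matrix and class labels
ada = AdaBoostClassifier(n_estimators=100)
selector = RFECV(ada, step=1, cv=5)
selector = selector.fit(np.asarray(total_data), np.asarray(target))
selector.support_
print "feature ranking", selector.ranking_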

I get the following error:

 selector = selector.fit(np.asarray(total_data), np.asarray(target))
  File "C:\Python27\lib\site-packages\sklearn\feature_selection\rfe.py", line 336, in fit
    ranking_ = rfe.fit(X_train, y_train).ranking_
  File "C:\Python27\lib\site-packages\sklearn\feature_selection\rfe.py", line 148, in fit
    if estimator.coef_.ndim > 1:
AttributeError: 'AdaBoostClassifier' object has no attribute 'coef_'

Does anyone have any insight into this error and how to fix it?

Thanks!

1 Answer:

Answer 0 (score: 3)

Straight from the docstring of RFECV:

Parameters
----------
estimator : object
    A supervised learning estimator with a `fit` method that updates a
    `coef_` attribute that holds the fitted parameters. Important features
    must correspond to high absolute values in the `coef_` array.

    For instance, this is the case for most supervised learning
    algorithms such as Support Vector Classifiers and Generalized
    Linear Models from the `svm` and `linear_model` modules.

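In other words, RFE is currently implemented only for estimators that expose coef_, which in practice means linear models. AdaBoostClassifier has no coef_ attribute, hence the AttributeError. You could change RFE to use feature_importances_ instead of coef_ and submit a patch so that it works with other models.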
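If the goal is only to see which features the boosted model relies on, one option that avoids RFECV altogether is to read feature_importances_ from the fitted AdaBoostClassifier directly. The following is a minimal sketch, not a drop-in fix: the data comes from make_classification as a hypothetical stand-in for total_data and target, and the sizes and random_state values are arbitrary assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier

# Hypothetical synthetic data standing in for total_data / target.
X, y = make_classification(n_samples=200, n_features=10,
                           n_informative=3, random_state=0)

ada = AdaBoostClassifier(n_estimators=50, random_state=0)
ada.fit(X, y)

# One non-negative score per feature; higher means the ensemble
# relied on that feature more.
importances = ada.feature_importances_

# Feature indices sorted from most to least important.
order = np.argsort(importances)[::-1]
for idx in order:
    print("feature %d: %.3f" % (idx, importances[idx]))
```

Note that this ranks features by how much the fitted ensemble used them. It does not do the cross-validated elimination that RFECV performs, so it is a simpler substitute rather than an equivalent.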