LDA as the base learner for AdaBoost in Python

Time: 2018-03-15 18:45:44

Tags: python scikit-learn classification adaboost

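I am using AdaBoost for multi-class classification with a discriminant (linear or quadratic) as the base learner. I could not find anything in scikit-learn or any other library that implements this. How can I do it?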

1 answer:

Answer 0: (score: 1)

Although scikit-learn's AdaBoostClassifier lets you choose the base estimator (see the documentation), it requires that the estimator's fit method accept sample_weight. Take a look at the source:

if not has_fit_parameter(self.base_estimator_, "sample_weight"):
    raise ValueError("%s doesn't support sample_weight."
                     % self.base_estimator_.__class__.__name__)

Unfortunately, neither LinearDiscriminantAnalysis nor QuadraticDiscriminantAnalysis falls into this category. Here is a toy example:

from sklearn.ensemble import AdaBoostClassifier
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis as LDA
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis as QDA
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(iris.data, iris.target)

clf = AdaBoostClassifier(base_estimator=LDA())
clf.fit(X_train, y_train)

You will see a traceback like the following:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "//anaconda/lib/python2.7/site-packages/sklearn/ensemble/weight_boosting.py", line 411, in fit
    return super(AdaBoostClassifier, self).fit(X, y, sample_weight)
  File "//anaconda/lib/python2.7/site-packages/sklearn/ensemble/weight_boosting.py", line 128, in fit
    self._validate_estimator()
  File "//anaconda/lib/python2.7/site-packages/sklearn/ensemble/weight_boosting.py", line 429, in _validate_estimator
    % self.base_estimator_.__class__.__name__)
ValueError: LinearDiscriminantAnalysis doesn't support sample_weight.
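The check that raises this error can also be run on its own, before any ensemble is built. The following is a minimal sketch that uses the same has_fit_parameter helper seen in the source excerpt; the list of candidate estimators (including the decision stump and GaussianNB) is only an illustrative assumption, not something from the question.

```python
from sklearn.discriminant_analysis import (
    LinearDiscriminantAnalysis,
    QuadraticDiscriminantAnalysis,
)
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier
from sklearn.utils.validation import has_fit_parameter

# Illustrative candidates: the two discriminants from the question plus two
# estimators whose fit() is known to take sample_weight.
candidates = [
    LinearDiscriminantAnalysis(),
    QuadraticDiscriminantAnalysis(),
    DecisionTreeClassifier(max_depth=1),
    GaussianNB(),
]

# Map each class name to whether its fit() signature has a sample_weight parameter.
supports_weights = {
    type(est).__name__: has_fit_parameter(est, "sample_weight")
    for est in candidates
}

for name, ok in supports_weights.items():
    status = "accepts" if ok else "does not accept"
    print(f"{name}: fit() {status} sample_weight")
```

Any estimator for which the helper returns False will be rejected by AdaBoostClassifier in the same way as LDA above.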

This is a requirement you cannot get around within scikit-learn. The documentation makes it clear that it is a hard requirement:

  

"... Support for sample weighting is required, as well as proper classes_ and n_classes_ attributes."

However, if all you want is an ensemble, you can always use bagging instead of boosting:

from sklearn.ensemble import BaggingClassifier
clf = BaggingClassifier(base_estimator=LDA())
clf.fit(X_train, y_train)
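For completeness, here is a self-contained sketch of the bagging route that also scores the result against a single LDA. The split size, n_estimators=25 and random_state=0 are arbitrary choices made for illustration. The base estimator is passed positionally because newer scikit-learn releases renamed the base_estimator keyword to estimator, so the keyword form shown above may not work on a recent installation.

```python
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
# Stratified split with a fixed seed so the run is repeatable (illustrative values).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Bagging fits each LDA on a bootstrap sample, so no sample_weight support is needed.
bagged_lda = BaggingClassifier(
    LinearDiscriminantAnalysis(), n_estimators=25, random_state=0
)
bagged_lda.fit(X_train, y_train)

# A single LDA on the same split, as a point of comparison.
single_lda = LinearDiscriminantAnalysis().fit(X_train, y_train)

bagged_acc = bagged_lda.score(X_test, y_test)
single_acc = single_lda.score(X_test, y_test)
print(f"bagged LDA accuracy: {bagged_acc:.3f}")
print(f"single LDA accuracy: {single_acc:.3f}")
```

LDA is a fairly stable, low-variance model, so bagging it may change accuracy very little; comparing the two scores on your own data is a quick way to see whether the ensemble is worth the extra cost.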