Can base_estimator be set to OneVsRestClassifier(DecisionTreeClassifier()) for AdaBoost?

Asked: 2019-03-09 09:01:35

Tags: python machine-learning scikit-learn grid-search adaboost

I am trying to build an AdaBoost model on a dataset with multiclass labels for age group and race group.

Since I plan to compute ROC and AUC, I binarized the target variables into yb_train2 (for age group) and yb_train3 (for race). I then tried One-vs-Rest with a decision tree model, and it worked fine.
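The binarize-then-One-vs-Rest step described above might look roughly like this. This is a minimal sketch with synthetic data standing in for the real dataset; the names X, y, yb, and ovr are illustrative, not from the question:

```python
from sklearn.datasets import make_classification
from sklearn.multiclass import OneVsRestClassifier
from sklearn.preprocessing import label_binarize
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the question's data (3 classes, e.g. age groups)
X, y = make_classification(n_samples=200, n_features=10, n_informative=5,
                           n_classes=3, random_state=0)

# Binarize the multiclass target so per-class ROC/AUC can be computed later
yb = label_binarize(y, classes=[0, 1, 2])

# One-vs-Rest over a decision tree, fit on the binarized labels
ovr = OneVsRestClassifier(DecisionTreeClassifier(random_state=0))
ovr.fit(X, yb)

print(yb.shape)              # one column per class
print(len(ovr.estimators_))  # one tree per class
```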

But now I don't know how to specify the parameters in the grid search. I tried the following code and got a syntax error:

abc = AdaBoostClassifier(base_estimator= (OneVsRestClassifier(DecisionTreeClassifier()))


param_grid = dict(base_estimator__estimator__criterion = ["gini", "entropy"],
                  base_estimator__estimator__splitter = ["best", "random"],
                  n_estimators = [1, 2],
                  learning_rate =  [0.0001,0.001,0.01,0.1,1]
                  )

grid = GridSearchCV(abc,param_grid)

grid.fit(X_train,yb_train2)
print('best score: {} with parameter: {}'.format(grid.best_score_, grid.best_params_))

grid.fit(X_train,yb_train3)
print('best score: {} with parameter: {}'.format(grid.best_score_, grid.best_params_))

Can anyone offer some suggestions for this case? Thanks :)

1 Answer:

Answer 0 (score: 0):

You missed one of the closing parentheses.

from sklearn.ensemble import AdaBoostClassifier
from sklearn.multiclass import OneVsRestClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import GridSearchCV

abc = AdaBoostClassifier(base_estimator=OneVsRestClassifier(DecisionTreeClassifier()))

param_grid = dict(base_estimator__estimator__criterion=["gini", "entropy"],
                  base_estimator__estimator__splitter=["best", "random"],
                  n_estimators=[1, 2],
                  learning_rate=[0.0001, 0.001, 0.01, 0.1, 1])

grid = GridSearchCV(abc, param_grid)

This should get you past the syntax error.

However, DecisionTreeClassifier is a multiclass classifier by default, so I would recommend not wrapping it in OneVsRestClassifier.