Can base_estimator be set to OneVsRestClassifier(DecisionTreeClassifier()) for AdaBoost?

Asked: 2019-03-09 09:01:35

Tags: python machine-learning scikit-learn grid-search adaboost

I am trying to build an AdaBoost model on a dataset with multiclass labels for age group and race group.

Since I plan to compute ROC and AUC, I binarized the target variables into yb_train2 (for age group) and yb_train3 (for race). I then tried One-vs-Rest with a decision tree model, and it worked fine.
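The binarize-then-One-vs-Rest step described above might look roughly like this. This is a minimal sketch with synthetic data standing in for the real dataset; the names X, y, yb, and ovr are illustrative, not from the question:

```python
from sklearn.datasets import make_classification
from sklearn.multiclass import OneVsRestClassifier
from sklearn.preprocessing import label_binarize
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the question's data (3 classes, e.g. age groups)
X, y = make_classification(n_samples=200, n_features=10, n_informative=5,
                           n_classes=3, random_state=0)

# Binarize the multiclass target so per-class ROC/AUC can be computed later
yb = label_binarize(y, classes=[0, 1, 2])

# One-vs-Rest over a decision tree, fit on the binarized labels
ovr = OneVsRestClassifier(DecisionTreeClassifier(random_state=0))
ovr.fit(X, yb)

print(yb.shape)              # one column per class
print(len(ovr.estimators_))  # one tree per class
```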

But now I don't know how to specify the parameters in the grid search. I tried the following code and got a syntax error:

abc = AdaBoostClassifier(base_estimator= (OneVsRestClassifier(DecisionTreeClassifier()))


param_grid = dict(base_estimator__estimator__criterion = ["gini", "entropy"],
                  base_estimator__estimator__splitter = ["best", "random"],
                  n_estimators = [1, 2],
                  learning_rate =  [0.0001,0.001,0.01,0.1,1]
                  )

grid = GridSearchCV(abc,param_grid)

grid.fit(X_train,yb_train2)
print('best score: {} with parameter: {}'.format(grid.best_score_, grid.best_params_))

grid.fit(X_train,yb_train3)
print('best score: {} with parameter: {}'.format(grid.best_score_, grid.best_params_))

Can anyone offer some suggestions for this case? Thanks :)

1 Answer:

Answer 0 (score: 0):

You missed one of the closing parentheses.

from sklearn.ensemble import AdaBoostClassifier
from sklearn.multiclass import OneVsRestClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import GridSearchCV

abc = AdaBoostClassifier(base_estimator=OneVsRestClassifier(DecisionTreeClassifier()))

param_grid = dict(base_estimator__estimator__criterion=["gini", "entropy"],
                  base_estimator__estimator__splitter=["best", "random"],
                  n_estimators=[1, 2],
                  learning_rate=[0.0001, 0.001, 0.01, 0.1, 1])

grid = GridSearchCV(abc, param_grid)

This should get you past the syntax error.

However, DecisionTreeClassifier is a multiclass classifier by default, so I would recommend not wrapping it in OneVsRestClassifier.