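I'm trying to build an AdaBoost model on a dataset that has multiclass labels for age group and race group.
Because I plan to compute ROC curves and AUC, I binarized the target variables into yb_train2 (for age group) and yb_train3 (for race). I then tried one-vs-rest with a decision tree model, and that worked well.
But now I don't know how to specify the parameters in a grid search. I tried the following code and got a syntax error: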
abc = AdaBoostClassifier(base_estimator= (OneVsRestClassifier(DecisionTreeClassifier()))
param_grid = dict(base_estimator__estimator__criterion = ["gini", "entropy"],
base_estimator__estimator__splitter = ["best", "random"],
n_estimators = [1, 2],
learning_rate = [0.0001,0.001,0.01,0.1,1]
)
grid = GridSearchCV(abc,param_grid)
grid.fit(X_train,yb_train2)
print ('best score: {:}').format(grid.best_score_ ), ('with parameter: {:}').format(grid.best_params_)
grid.fit(X_train,yb_train3)
print ('best score: {:}').format(grid.best_score_ ), ('with parameter:{:}').format(grid.best_params_)
Answer 0 (score: 0)
You are missing one closing parenthesis on the line that creates the AdaBoostClassifier.
from sklearn.ensemble import AdaBoostClassifier
from sklearn.multiclass import OneVsRestClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import GridSearchCV
# The original line was missing its final closing parenthesis.
abc = AdaBoostClassifier(base_estimator= (OneVsRestClassifier(DecisionTreeClassifier())))
param_grid = dict(base_estimator__estimator__criterion = ["gini", "entropy"],
base_estimator__estimator__splitter = ["best", "random"],
n_estimators = [1, 2],
learning_rate = [0.0001,0.001,0.01,0.1,1]
)
grid = GridSearchCV(abc,param_grid)
This should get you past the syntax error.
However, DecisionTreeClassifier already handles multiclass targets natively, so I'd suggest not wrapping it in OneVsRestClassifier.
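As a rough sketch of that suggestion, the grid search could be run with the tree as the direct base learner, fitting on the original (non-binarized) labels and computing a one-vs-rest AUC only at scoring time. The synthetic data, the depth-1 trees, the grid values, and the `key` helper variable are my own assumptions for illustration, not part of the question. Note that newer scikit-learn releases renamed `base_estimator` to `estimator`, so the sketch checks which keyword the installed version accepts.

```python
import inspect

from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic 3-class data standing in for X_train and the original multiclass labels
# (assumption: your real labels are a single column of class ids, not binarized).
X, y = make_classification(n_samples=300, n_features=10, n_informative=5,
                           n_classes=3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# The base-learner keyword is `estimator` in newer scikit-learn releases
# and `base_estimator` in older ones; pick whichever this version accepts.
params = inspect.signature(AdaBoostClassifier).parameters
key = "estimator" if "estimator" in params else "base_estimator"

# The tree is passed directly; no OneVsRestClassifier wrapper.
abc = AdaBoostClassifier(**{key: DecisionTreeClassifier(max_depth=1)},
                         random_state=0)

# With no wrapper there is only one level of nesting: <key>__<tree parameter>.
param_grid = {
    key + "__criterion": ["gini", "entropy"],
    key + "__splitter": ["best", "random"],
    "n_estimators": [10, 50],
    "learning_rate": [0.01, 0.1, 1.0],
}

grid = GridSearchCV(abc, param_grid, cv=3)
grid.fit(X_train, y_train)
print("best score: {:.3f} with parameters: {}".format(
    grid.best_score_, grid.best_params_))

# One-vs-rest AUC computed from predicted class probabilities,
# so the training target never needs to be binarized.
proba = grid.predict_proba(X_test)
auc = roc_auc_score(y_test, proba, multi_class="ovr")
print("one-vs-rest AUC on held-out data: {:.3f}".format(auc))
```

You would run the same thing once per target (age group, then race), replacing the synthetic data with your own X_train and label column.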