GridSearchCV with RandomForest

Date: 2017-07-30 18:51:38

Tags: python-3.x machine-learning random-forest grid-search

I am using RandomForest with GridSearchCV to tune some parameters. Here is my code.

#Import 'GridSearchCV', 'make_scorer', 'f1_score' and the classifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.metrics import f1_score, make_scorer

#Create the parameters list you wish to tune
parameters = {'n_estimators':[5,10,15]}

#Initialize the classifier
clf = GridSearchCV(RandomForestClassifier(), parameters)

#Make an f1 scoring function using 'make_scorer' 
f1_scorer = make_scorer(f1_score)

#Perform grid search on the classifier using the f1_scorer as the scoring method
grid_obj = GridSearchCV(clf, param_grid=parameters, scoring=f1_scorer,cv=5)

print(clf.get_params().keys())

#Fit the grid search object to the training data and find the optimal parameters
grid_obj = grid_obj.fit(X_train_100,y_train_100)

The problem is the following error: "ValueError: Invalid parameter max_features for estimator GridSearchCV. Check the list of available parameters with estimator.get_params().keys()."

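I followed the advice in the error message, and the output of print(clf.get_params().keys()) is shown below. However, even when I copy and paste these names into my parameters dictionary, I still get an error. I have been searching Stack Overflow, and most people use a parameter dictionary very similar to mine. Does anyone know how to fix this? Thanks again!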

dict_keys(['pre_dispatch', 'cv', 'estimator__max_features', 'param_grid', 'refit', 'estimator__min_impurity_split', 'n_jobs', 'estimator__random_state', 'error_score', 'verbose', 'estimator__min_samples_split', 'estimator__n_jobs', 'fit_params', 'estimator__min_weight_fraction_leaf', 'scoring', 'estimator__warm_start', 'estimator__criterion', 'estimator__verbose', 'estimator__bootstrap', 'estimator__class_weight', 'estimator__oob_score', 'iid', 'estimator', 'estimator__max_depth', 'estimator__max_leaf_nodes', 'estimator__min_samples_leaf', 'estimator__n_estimators', 'return_train_score'])

1 answer:

Answer 0 (score: 2)

I think the problem is in these two lines:

clf = GridSearchCV(RandomForestClassifier(), parameters)
grid_obj = GridSearchCV(clf, param_grid=parameters, scoring=f1_scorer,cv=5)

This actually creates an object with the following structure:

grid_obj = GridSearchCV(GridSearchCV(RandomForestClassifier()))

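That is probably more GridSearchCV than you wanted. The outer search tries to set each key of the parameter grid directly on the inner GridSearchCV, which has no such parameter of its own (it only exposes the forest's settings under the estimator__ prefix, as your get_params() output shows), hence the ValueError. Pass RandomForestClassifier() itself to a single GridSearchCV instead.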
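For what it's worth, here is a minimal sketch of the un-nested version. The synthetic data from make_classification is only a stand-in for X_train_100 and y_train_100 (I am assuming binary labels), random_state=0 is my own addition for repeatability, and I am assuming the scorer was meant to wrap f1_score:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score, make_scorer
from sklearn.model_selection import GridSearchCV

# Stand-in data (assumption): replace with the real X_train_100 / y_train_100.
X_train_100, y_train_100 = make_classification(
    n_samples=100, n_features=10, random_state=0
)

# Keys must be parameter names of the estimator handed to GridSearchCV.
parameters = {'n_estimators': [5, 10, 15]}

# Wrap the metric function f1_score in a scorer.
f1_scorer = make_scorer(f1_score)

# The plain classifier goes in; there is only one GridSearchCV.
clf = RandomForestClassifier(random_state=0)
grid_obj = GridSearchCV(clf, param_grid=parameters, scoring=f1_scorer, cv=5)

# Fit the grid search and report the best setting found.
grid_obj = grid_obj.fit(X_train_100, y_train_100)
print(grid_obj.best_params_)
```

With this layout, clf.get_params().keys() lists the unprefixed names (n_estimators, max_features, and so on), and those are the keys that belong in the parameters dictionary.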