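I am running a grid search over a RandomForestClassifier. My code worked until I changed the features, and then it suddenly started raising the error below (on the classifier.fit line).
I did not change any code; I only reduced the number of features from 16 to 8. I have no idea what to look for. What does this error mean?
Error: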
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "/home/zqz/Programs/anaconda3/lib/python3.5/site-packages/sklearn/externals/joblib/_parallel_backends.py", line 344, in __call__
return self.func(*args, **kwargs)
File "/home/zqz/Programs/anaconda3/lib/python3.5/site-packages/sklearn/externals/joblib/parallel.py", line 131, in __call__
return [func(*args, **kwargs) for func, args, kwargs in self.items]
File "/home/zqz/Programs/anaconda3/lib/python3.5/site-packages/sklearn/externals/joblib/parallel.py", line 131, in <listcomp>
return [func(*args, **kwargs) for func, args, kwargs in self.items]
File "/home/zqz/Programs/anaconda3/lib/python3.5/site-packages/sklearn/ensemble/forest.py", line 120, in _parallel_build_trees
tree.fit(X, y, sample_weight=curr_sample_weight, check_input=False)
File "/home/zqz/Programs/anaconda3/lib/python3.5/site-packages/sklearn/tree/tree.py", line 739, in fit
X_idx_sorted=X_idx_sorted)
File "/home/zqz/Programs/anaconda3/lib/python3.5/site-packages/sklearn/tree/tree.py", line 246, in fit
raise ValueError("max_features must be in (0, n_features]")
ValueError: max_features must be in (0, n_features]
Code:
classifier = RandomForestClassifier(n_estimators=20, n_jobs=-1)
rfc_tuning_params = {"max_depth": [3, 5, None],
                     "max_features": [1, 3, 5, 7, 10],
                     "min_samples_split": [2, 5, 10],
                     "min_samples_leaf": [1, 3, 10],
                     "bootstrap": [True, False],
                     "criterion": ["gini", "entropy"]}
classifier = GridSearchCV(classifier, param_grid=rfc_tuning_params, cv=nfold,
                          n_jobs=cpus)
model_file = os.path.join(os.path.dirname(__file__), "random-forest_classifier-%s.m" % task)
classifier.fit(X_train, y_train)  # this line raises the error
nfold_predictions=cross_val_predict(classifier.best_estimator_, X_train, y_train, cv=nfold)
Answer 0 (score: 2)
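In rfc_tuning_params you have "max_features": [1, 3, 5, 7, 10]. That list includes 10, which is greater than the number of features you now have (8). An integer max_features must lie in (0, n_features], so the grid search fails as soon as it tries to fit a forest with max_features=10. That is why you get the error
ValueError: max_features must be in (0, n_features]
So you need to remove 10 from "max_features", or more generally keep every value in that list at or below 8.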
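To avoid hitting this again the next time the feature count changes, one option is to derive the "max_features" candidates from the data instead of hard-coding them. The following is only a minimal sketch: the helper name valid_max_features is hypothetical (it is not part of scikit-learn), and n_features is set to 8 by hand here, whereas in the question's code it would presumably come from X_train.shape[1].

```python
def valid_max_features(candidates, n_features):
    """Keep only the integer candidates that lie in (0, n_features].

    Hypothetical helper for illustration; not a scikit-learn function.
    """
    return [c for c in candidates if 0 < c <= n_features]


# Assumption: 8 features, as in the question after the reduction from 16.
# With real data this would be something like X_train.shape[1].
n_features = 8

rfc_tuning_params = {
    "max_depth": [3, 5, None],
    # 10 is dropped automatically because it exceeds n_features.
    "max_features": valid_max_features([1, 3, 5, 7, 10], n_features),
    "min_samples_split": [2, 5, 10],
    "min_samples_leaf": [1, 3, 10],
    "bootstrap": [True, False],
    "criterion": ["gini", "entropy"],
}

print(rfc_tuning_params["max_features"])
```

The resulting dictionary can be passed to GridSearchCV as param_grid exactly as before. Another way to sidestep the problem is to give max_features as fractions (floats) or as "sqrt" / "log2", which scikit-learn interprets relative to the number of features rather than as an absolute count.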