How can I define an AdaBoostRegressor with several base_estimators? My code is below...
# read data and label from TrainFile
data, label = file.reade_train_file(rouge, TrainFile)

tuned_parameters = [{
    'loss': ['exponential'],
    'random_state': [47],
    'learning_rate': [1],
}]

base_models = [
    ExtraTreesRegressor(n_estimators=350,
                        criterion='mse',
                        max_features='log2',
                        random_state=40),
    RandomForestRegressor(n_estimators=900,
                          criterion='mse',
                          max_features='sqrt',
                          min_samples_split=3,
                          random_state=40),
]

clf = GridSearchCV(AdaBoostRegressor(base_models), tuned_parameters, cv=4)
clf.fit(data, label)
The error is:
> Traceback (most recent call last):
File "/home/aliasghar/MySumFarsi/sumFarsi/prjSumFarsi/Documents_References.py", line 956, in <module>
documents_References.train(1)
File "/home/aliasghar/MySumFarsi/sumFarsi/prjSumFarsi/Documents_References.py", line 886, in train
self.get_best_AdaBoostRegressor_for_train(rouge,TrainFile)
File "/home/aliasghar/MySumFarsi/sumFarsi/prjSumFarsi/Documents_References.py", line 289, in get_best_AdaBoostRegressor_for_train
clf.fit(data,label)
File "/usr/local/lib/python3.5/dist-packages/sklearn/model_selection/_search.py", line 638, in fit
cv.split(X, y, groups)))
File "/usr/local/lib/python3.5/dist-packages/sklearn/externals/joblib/parallel.py", line 779, in __call__
while self.dispatch_one_batch(iterator):
File "/usr/local/lib/python3.5/dist-packages/sklearn/externals/joblib/parallel.py", line 625, in dispatch_one_batch
self._dispatch(tasks)
File "/usr/local/lib/python3.5/dist-packages/sklearn/externals/joblib/parallel.py", line 588, in _dispatch
job = self._backend.apply_async(batch, callback=cb)
File "/usr/local/lib/python3.5/dist-packages/sklearn/externals/joblib/_parallel_backends.py", line 111, in apply_async
result = ImmediateResult(func)
File "/usr/local/lib/python3.5/dist-packages/sklearn/externals/joblib/_parallel_backends.py", line 332, in __init__
self.results = batch()
File "/usr/local/lib/python3.5/dist-packages/sklearn/externals/joblib/parallel.py", line 131, in __call__
return [func(*args, **kwargs) for func, args, kwargs in self.items]
File "/usr/local/lib/python3.5/dist-packages/sklearn/externals/joblib/parallel.py", line 131, in <listcomp>
return [func(*args, **kwargs) for func, args, kwargs in self.items]
File "/usr/local/lib/python3.5/dist-packages/sklearn/model_selection/_validation.py", line 437, in _fit_and_score
estimator.fit(X_train, y_train, **fit_params)
File "/usr/local/lib/python3.5/dist-packages/sklearn/ensemble/weight_boosting.py", line 960, in fit
return super(AdaBoostRegressor, self).fit(X, y, sample_weight)
File "/usr/local/lib/python3.5/dist-packages/sklearn/ensemble/weight_boosting.py", line 145, in fit
random_state)
File "/usr/local/lib/python3.5/dist-packages/sklearn/ensemble/weight_boosting.py", line 1006, in _boost
estimator = self._make_estimator(random_state=random_state)
File "/usr/local/lib/python3.5/dist-packages/sklearn/ensemble/base.py", line 126, in _make_estimator
estimator.set_params(**dict((p, getattr(self, p))
AttributeError: 'list' object has no attribute 'set_params'
Answer (score: 2)
If I understand your question correctly, you want to run GridSearchCV over AdaBoost with the option of trying different base regressors. I think you are looking for something like the following.

First, define your list of base estimators:
base_models = [
    ExtraTreesRegressor(n_estimators=5,
                        criterion='mse',
                        max_features='log2',
                        random_state=40),
    RandomForestRegressor(n_estimators=5,
                          criterion='mse',
                          max_features='sqrt',
                          min_samples_split=3,
                          random_state=40),
]
Then define the parameters you want to tune and add the base models as one more grid entry, `base_estimator` (also make sure the parameters are stored in a dictionary, not a list):
tuned_parameters = {
    'base_estimator': base_models,
    'loss': ['exponential'],
    'random_state': [47],
    'learning_rate': [1],
}
clf = GridSearchCV(AdaBoostRegressor(), tuned_parameters, cv=4)
clf.fit(data, label)
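For completeness, here is a self-contained sketch of this approach on synthetic data (the question's `TrainFile` data is not available, so random values stand in for it). As a portability note: recent scikit-learn releases renamed `base_estimator` to `estimator`, so the sketch detects which name the installed version expects:

```python
import inspect

import numpy as np
from sklearn.ensemble import (AdaBoostRegressor, ExtraTreesRegressor,
                              RandomForestRegressor)
from sklearn.model_selection import GridSearchCV

# Synthetic regression data standing in for the question's TrainFile.
rng = np.random.RandomState(0)
data = rng.rand(80, 4)
label = data @ np.array([1.0, 2.0, 3.0, 4.0]) + rng.normal(scale=0.1, size=80)

# Older scikit-learn uses 'base_estimator'; newer versions use 'estimator'.
ada_params = inspect.signature(AdaBoostRegressor.__init__).parameters
param = 'estimator' if 'estimator' in ada_params else 'base_estimator'

base_models = [
    ExtraTreesRegressor(n_estimators=5, random_state=40),
    RandomForestRegressor(n_estimators=5, random_state=40),
]

# The base models are just one more hyperparameter in the grid: GridSearchCV
# fits one AdaBoostRegressor per candidate base model and keeps the best.
tuned_parameters = {
    param: base_models,
    'loss': ['exponential'],
    'random_state': [47],
    'learning_rate': [1],
}

clf = GridSearchCV(AdaBoostRegressor(), tuned_parameters, cv=4)
clf.fit(data, label)
print(type(clf.best_params_[param]).__name__)
```

Each grid cell trains a full boosted ensemble around the chosen regressor, so `clf.best_params_` tells you which base model won the cross-validation.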
If you are trying to use several regressors at the same time inside a single AdaBoostRegressor, that is not possible, as @Jan K pointed out.