SageMaker hyperparameter optimization for XGBoost

Time: 2018-06-22 09:42:02

Tags: amazon-sagemaker

I am trying to build a hyperparameter optimization job in Amazon SageMaker from Python, but something is not working. Here is what I have:

sess = sagemaker.Session()

xgb = sagemaker.estimator.Estimator(containers[boto3.Session().region_name],
                                    role, 
                                    train_instance_count=1, 
                                    train_instance_type='ml.m4.4xlarge',
                                    output_path=output_path_1,
                                    base_job_name='HPO-xgb',
                                    sagemaker_session=sess)

from sagemaker.tuner import HyperparameterTuner, IntegerParameter, CategoricalParameter, ContinuousParameter    

hyperparameter_ranges = {'eta': ContinuousParameter(0.01, 0.2),
                         'num_rounds': ContinuousParameter(100, 500),
                         'num_class':  4,
                         'max_depth': IntegerParameter(3, 9),
                         'gamma': IntegerParameter(0, 5),
                         'min_child_weight': IntegerParameter(2, 6),
                         'subsample': ContinuousParameter(0.5, 0.9),
                         'colsample_bytree': ContinuousParameter(0.5, 0.9)}

objective_metric_name = 'validation:mlogloss'
objective_type='minimize'
metric_definitions = [{'Name': 'validation-mlogloss',
                       'Regex': 'validation-mlogloss=([0-9\\.]+)'}]

tuner = HyperparameterTuner(xgb,
                            objective_metric_name,
                            objective_type,
                            hyperparameter_ranges,
                            metric_definitions,
                            max_jobs=9,
                            max_parallel_jobs=3)

tuner.fit({'train': s3_input_train, 'validation': s3_input_validation}) 

The error I get is:

AttributeError: 'str' object has no attribute 'keys'

The error appears to come from the tuner.py file:

----> 1 tuner.fit({'train': s3_input_train, 'validation': s3_input_validation})

~/anaconda3/envs/python3/lib/python3.6/site-packages/sagemaker/tuner.py in fit(self, inputs, job_name, **kwargs)
    144             self.estimator._prepare_for_training(job_name)
    145 
--> 146         self._prepare_for_training(job_name=job_name)
    147         self.latest_tuning_job = _TuningJob.start_new(self, inputs)
    148 

~/anaconda3/envs/python3/lib/python3.6/site-packages/sagemaker/tuner.py in _prepare_for_training(self, job_name)
    120 
    121         self.static_hyperparameters = {to_str(k): to_str(v) for (k, v) in self.estimator.hyperparameters().items()}
--> 122         for hyperparameter_name in self._hyperparameter_ranges.keys():
    123             self.static_hyperparameters.pop(hyperparameter_name, None)
    124 

AttributeError: 'list' object has no attribute 'keys'                           

1 answer:

Answer 0 (score: 3)

You are passing the arguments in the wrong order when initializing the HyperparameterTuner object. The constructor has the following signature:

HyperparameterTuner(estimator, 
                    objective_metric_name, 
                    hyperparameter_ranges, 
                    metric_definitions=None, 
                    strategy='Bayesian', 
                    objective_type='Maximize', 
                    max_jobs=1, 
                    max_parallel_jobs=1, 
                    tags=None, 
                    base_tuning_job_name=None)

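So in this case your objective_type is in the wrong position: passed as the third positional argument, it is bound to hyperparameter_ranges, and the tuner then fails when it calls .keys() on that value. See the docs for more details.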
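
To make the mix-up concrete, here is a minimal sketch that does not need the SageMaker SDK or AWS credentials. `tuner_stub` is a hypothetical stand-in that copies only the parameter list quoted above; it is not the real `HyperparameterTuner`. The values `'xgb'`, `ranges`, and `metrics` are placeholders for the objects built in the question. The capitalized `'Minimize'` is an assumption based on the `'Maximize'` default in the signature.

```python
import inspect


# Hypothetical stand-in: same parameter list as the signature quoted above,
# but none of the real HyperparameterTuner behaviour.
def tuner_stub(estimator,
               objective_metric_name,
               hyperparameter_ranges,
               metric_definitions=None,
               strategy='Bayesian',
               objective_type='Maximize',
               max_jobs=1,
               max_parallel_jobs=1,
               tags=None,
               base_tuning_job_name=None):
    return hyperparameter_ranges


signature = inspect.signature(tuner_stub)

# Placeholders for the estimator, ranges, and metric definitions in the question.
ranges = {'max_depth': (3, 9)}
metrics = [{'Name': 'validation-mlogloss',
            'Regex': 'validation-mlogloss=([0-9\\.]+)'}]

# Positional order used in the question: objective_type comes third.
wrong = signature.bind('xgb', 'validation:mlogloss', 'minimize',
                       ranges, metrics,
                       max_jobs=9, max_parallel_jobs=3)

# Everything after the string is shifted one slot to the right.
for name in ('hyperparameter_ranges', 'metric_definitions', 'strategy'):
    print(name, '<-', type(wrong.arguments[name]).__name__)

# A str has no .keys(), which matches the reported AttributeError.
print(hasattr(wrong.arguments['hyperparameter_ranges'], 'keys'))

# Passing the optional arguments by keyword removes the dependence on order.
right = signature.bind('xgb', 'validation:mlogloss', ranges,
                       metric_definitions=metrics,
                       objective_type='Minimize',  # assumed capitalization
                       max_jobs=9, max_parallel_jobs=3)
print(type(right.arguments['hyperparameter_ranges']).__name__)
```

Applied to the real constructor, the same idea means keeping estimator, objective_metric_name, and hyperparameter_ranges as the first three arguments and passing metric_definitions, objective_type, max_jobs, and max_parallel_jobs by name.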