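h2o: Deep Learning Cartesian grid search with the same training and validation data and the same hyperparameters produces different models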

Time: 2020-07-02 19:11:49

Tags: h2o

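I am new to AI/ML and recently discovered h2o.

Because I want to run every combination, I am using a grid search with the Cartesian strategy for deep learning. I ran it twice with the same training and validation files, the same set of hyperparameters, and the same grid.train arguments. Both runs produced the same number of models, and each model was built with the same input parameters: "activation, adaptive_rate, epsilon, hidden, hidden_dropout_ratios, input_dropout_ratio, rho".

What I observe is that, from one run to the next, models built with the same input parameters have different logloss, mean per-class error, MSE, RMSE, and so on. To rule out user error, I restricted the grid search to a single set of parameters. My findings, with detailed logs, are below.

My question: given the same parameter set and the same training/validation frames, how can I guarantee that the generated models are exactly identical?

Training and validation file format and data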


BPS1,BPS2,ZSRTN,PCNT_RTN,PCNT_RTN100,Open,High,Low,Close,Time
58,18  ,  3.00  , -0.12  , -12  , 297.2700  , 297.3100  , 297.0800  , 297.1700  , 201907050935
18,20  ,  3.00  , -0.11  , -11  , 297.1800  , 297.1900  , 296.9300  , 296.9400  , 201907050940
20,20  ,  5.00  ,  0.01  , 1  , 296.9400  , 297.2600  , 296.8200  , 297.2150  , 201907050945
20,30  ,  5.00  ,  0.03  , 3  , 297.2200  , 297.2600  , 297.0400  , 297.0400  , 201907050950

activation = RectifierWithDropout
adaptive_rate = true
epsilon = 1.0E-6
hidden = [200]
hidden_dropout_ratios = [0.1]
input_dropout_ratio = 0.05
rho = 0.9

Python code


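    from h2o.estimators.deeplearning import H2ODeepLearningEstimator
    from h2o.grid.grid_search import H2OGridSearch

    # Each hyperparameter maps to a list of candidate values.
    # One value per key gives a search space of size 1.
    hyper_parameters = {
            "hidden": [[200]],
            "epsilon": [1.0E-6],
            "adaptive_rate": [True],
            "activation": ["RectifierWithDropout"],
            "input_dropout_ratio": [0.05],
            "hidden_dropout_ratios": [[0.1]],
            "rho": [0.9]
            }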



.....

    search_criteria = {"strategy": "Cartesian"}
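With the Cartesian strategy, the grid builds one model per combination of candidate values. The sketch below is plain Python, not H2O's internal enumeration; it only illustrates why the log reports a search space of size 1 when every hyperparameter has a single candidate. It assumes each hyperparameter is written as a list of candidate values.

```python
from itertools import product

# Same hyperparameters as in the question, each as a list of candidates.
hyper_parameters = {
    "hidden": [[200]],
    "epsilon": [1.0e-6],
    "adaptive_rate": [True],
    "activation": ["RectifierWithDropout"],
    "input_dropout_ratio": [0.05],
    "hidden_dropout_ratios": [[0.1]],
    "rho": [0.9],
}

# The Cartesian product picks one candidate per hyperparameter.
names = sorted(hyper_parameters)
combinations = [
    dict(zip(names, values))
    for values in product(*(hyper_parameters[name] for name in names))
]

print(len(combinations))
```

Adding a second candidate to any one list would double the number of combinations.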

.....

    # Build the grid over H2ODeepLearningEstimator with the settings above.
    model_grid = H2OGridSearch(model=H2ODeepLearningEstimator,
        grid_id=project_name,
        hyper_params=hyper_parameters,
        search_criteria=search_criteria)



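    # Train every combination in the grid. Early stopping watches
    # mean_per_class_error (stopping_rounds=5, stopping_tolerance=1e-3).
    model_grid.train(x=x,
        y=response_column,
        distribution=default_distribution, epochs=10000,
        training_frame=train, validation_frame=test,
        score_interval=0, stopping_rounds=5,
        stopping_tolerance=1e-3,
        stopping_metric="mean_per_class_error")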
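The parameter dumps in the logs below show `"_seed":-1` and `"_reproducible":false` for both runs. In H2O Deep Learning, a seed of -1 means a time-based random seed, and `reproducible` forces single-threaded training for repeatable results on small data. A minimal sketch of the extra keyword arguments one might try, assuming these two settings are the source of the variation (this was not verified against the setup in the question, and the seed value 1234 is an arbitrary choice):

```python
# Hypothetical additions to the keyword arguments passed to model_grid.train().
# `seed` and `reproducible` are H2O Deep Learning parameters; the value 1234
# is an arbitrary illustration, any fixed seed other than -1 would do.
reproducibility_args = {
    "seed": 1234,          # fixed seed instead of the time-based default (-1)
    "reproducible": True,  # single-threaded training; expect slower runs
}

# Assumed usage, mirroring the call above (not executed here):
# model_grid.train(x=x, y=response_column,
#                  training_frame=train, validation_frame=test,
#                  **reproducibility_args)
print(reproducibility_args)
```

The trade-off is speed: with `reproducible` enabled, training uses one thread, so large grids may take much longer.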


Setup for the first run



07-02 09:27:45.989 192.168.123.5:54321   #7248  #75857-26 INFO: Starting gridsearch: estimated size of search space = 1
07-02 09:27:45.990 192.168.123.5:54321   #7248  FJ-1-51   INFO: Due to the grid time limit, changing model max runtime to: 1.7976931348623157E308 secs.
07-02 09:27:45.992 192.168.123.5:54321   #7248  FJ-1-51   INFO: Building H2O DeepLearning model with these parameters:
07-02 09:27:45.992 192.168.123.5:54321   #7248  FJ-1-51   INFO: {"_train":{"name":"py_1_sid_b81b","type":"Key"},"_valid":{"name":"py_2_sid_b81b","type":"Key"},"_nfolds":0,"_keep_cross_validation_models":true,"_keep_cross_validation_predictions":false,"_keep_cross_validation_fold_assignment":false,"_parallelize_cross_validation":true,"_auto_rebalance":true,"_seed":-1,"_fold_assignment":"AUTO","_categorical_encoding":"AUTO","_max_categorical_levels":10,"_distribution":"AUTO","_tweedie_power":1.5,"_quantile_alpha":0.5,"_huber_alpha":0.9,"_ignored_columns":["Close","PCNT_RTN100","High","Low","PCNT_RTN","Time","Open"],"_ignore_const_cols":true,"_weights_column":null,"_offset_column":null,"_fold_column":null,"_check_constant_response":true,"_is_cv_model":false,"_score_each_iteration":false,"_max_runtime_secs":1.7976931348623157E308,"_stopping_rounds":5,"_stopping_metric":"mean_per_class_error","_stopping_tolerance":0.001,"_response_column":"ZSRTN","_balance_classes":false,"_max_after_balance_size":5.0,"_class_sampling_factors":null,"_max_confusion_matrix_size":20,"_checkpoint":null,"_pretrained_autoencoder":null,"_custom_metric_func":null,"_custom_distribution_func":null,"_export_checkpoints_dir":null,"_overwrite_with_best_model":true,"_autoencoder":false,"_use_all_factor_levels":true,"_standardize":true,"_activation":"RectifierWithDropout","_hidden":[200],"_epochs":10000.0,"_train_samples_per_iteration":-2,"_target_ratio_comm_to_comp":0.05,"_adaptive_rate":true,"_rho":0.9,"_epsilon":1.0E-6,"_rate":0.005,"_rate_annealing":1.0E-6,"_rate_decay":1.0,"_momentum_start":0.0,"_momentum_ramp":1000000.0,"_momentum_stable":0.0,"_nesterov_accelerated_gradient":true,"_input_dropout_ratio":0.05,"_hidden_dropout_ratios":[0.1],"_l1":0.0,"_l2":0.0,"_max_w2":3.4028235E38,"_initial_weight_distribution":"UniformAdaptive","_initial_weight_scale":1.0,"_initial_weights":null,"_initial_biases":null,"_loss":"Automatic","_score_interval":0.0,"_score_training_samples":10000,"_score_validation_samples":0,"_score_duty_cycle":0.1,"_classification_stop":0.0,"_regression_stop":1.0E-6,"_quiet_mode":false,"_score_validation_sampling":"Uniform","_diagnostics":true,"_variable_importances":true,"_fast_mode":true,"_force_load_balance":true,"_replicate_training_data":true,"_single_node_mode":false,"_shuffle_training_data":false,"_missing_values_handling":"MeanImputation","_sparse":false,"_col_major":false,"_average_activation":0.0,"_sparsity_beta":0.0,"_max_categorical_features":2147483647,"_reproducible":false,"_export_weights_and_biases":false,"_elastic_averaging":false,"_elastic_averaging_moving_rate":0.9,"_elastic_averaging_regularization":0.001,"_mini_batch_size":1}
07-02 09:27:45.992 192.168.123.5:54321   #7248  FJ-1-51   INFO: Dropping ignored columns: [Close, PCNT_RTN100, High, Low, PCNT_RTN, Time, Open]
07-02 09:27:45.992 192.168.123.5:54321   #7248  FJ-1-51   INFO: Dataset already contains 128 chunks. No need to rebalance.
07-02 09:27:45.993 192.168.123.5:54321   #7248  FJ-1-51   INFO: Starting model DeepLearning__gen_202007020927_m_5_r_2_b_2_pp_0.05_l_1_t_10_model_1
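The parameter dump is long, so the settings that matter for repeatability are easy to miss. The sketch below pulls individual scalar parameters out of such a dump with a regular expression, so it works on the raw log text without parsing the whole line as JSON. The excerpt is shortened to a few fields copied from the log above, and `read_param` is a hypothetical helper name.

```python
import re

# Shortened excerpt of the logged parameter dump (fields copied from the log).
log_excerpt = (
    '{"_auto_rebalance":true,"_seed":-1,"_fold_assignment":"AUTO",'
    '"_reproducible":false,"_mini_batch_size":1}'
)


def read_param(dump, name):
    """Return the raw text of a scalar parameter in the dump, or None."""
    match = re.search(r'"%s":([^,}]+)' % re.escape(name), dump)
    return match.group(1) if match else None


seed = read_param(log_excerpt, "_seed")
reproducible = read_param(log_excerpt, "_reproducible")
print("seed:", seed)
print("reproducible:", reproducible)
```

Both runs in this post report the same values for these two fields, so neither run used a fixed seed.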

Results of the first run

07-02 09:27:51.513 192.168.123.5:54321   #7248  #75857-30 INFO: Hyper-Parameter Search Summary (ordered by increasing logloss):
07-02 09:27:51.513 192.168.123.5:54321   #7248  #75857-30 INFO:             activation  adaptive_rate  epsilon  hidden  hidden_dropout_ratios  input_dropout_ratio  rho                                                            model_ids             logloss
07-02 09:27:51.513 192.168.123.5:54321   #7248  #75857-30 INFO:   RectifierWithDropout           true   1.0E-6   [200]                  [0.1]                 0.05  0.9  DeepLearning__gen_202007020927_m_5_r_2_b_2_pp_0.05_l_1_t_10_model_1  1.7762588168309075

Setup for the second run


07-02 09:32:49.293 192.168.123.5:54321   #7248  #75857-29 INFO: Starting gridsearch: estimated size of search space = 1
07-02 09:32:49.293 192.168.123.5:54321   #7248  FJ-1-25   INFO: Due to the grid time limit, changing model max runtime to: 1.7976931348623157E308 secs.
07-02 09:32:49.294 192.168.123.5:54321   #7248  FJ-1-25   INFO: Building H2O DeepLearning model with these parameters:
07-02 09:32:49.294 192.168.123.5:54321   #7248  FJ-1-25   INFO: {"_train":{"name":"py_1_sid_aeed","type":"Key"},"_valid":{"name":"py_2_sid_aeed","type":"Key"},"_nfolds":0,"_keep_cross_validation_models":true,"_keep_cross_validation_predictions":false,"_keep_cross_validation_fold_assignment":false,"_parallelize_cross_validation":true,"_auto_rebalance":true,"_seed":-1,"_fold_assignment":"AUTO","_categorical_encoding":"AUTO","_max_categorical_levels":10,"_distribution":"AUTO","_tweedie_power":1.5,"_quantile_alpha":0.5,"_huber_alpha":0.9,"_ignored_columns":["Time","Open","PCNT_RTN","PCNT_RTN100","Low","Close","High"],"_ignore_const_cols":true,"_weights_column":null,"_offset_column":null,"_fold_column":null,"_check_constant_response":true,"_is_cv_model":false,"_score_each_iteration":false,"_max_runtime_secs":1.7976931348623157E308,"_stopping_rounds":5,"_stopping_metric":"mean_per_class_error","_stopping_tolerance":0.001,"_response_column":"ZSRTN","_balance_classes":false,"_max_after_balance_size":5.0,"_class_sampling_factors":null,"_max_confusion_matrix_size":20,"_checkpoint":null,"_pretrained_autoencoder":null,"_custom_metric_func":null,"_custom_distribution_func":null,"_export_checkpoints_dir":null,"_overwrite_with_best_model":true,"_autoencoder":false,"_use_all_factor_levels":true,"_standardize":true,"_activation":"RectifierWithDropout","_hidden":[200],"_epochs":10000.0,"_train_samples_per_iteration":-2,"_target_ratio_comm_to_comp":0.05,"_adaptive_rate":true,"_rho":0.9,"_epsilon":1.0E-6,"_rate":0.005,"_rate_annealing":1.0E-6,"_rate_decay":1.0,"_momentum_start":0.0,"_momentum_ramp":1000000.0,"_momentum_stable":0.0,"_nesterov_accelerated_gradient":true,"_input_dropout_ratio":0.05,"_hidden_dropout_ratios":[0.1],"_l1":0.0,"_l2":0.0,"_max_w2":3.4028235E38,"_initial_weight_distribution":"UniformAdaptive","_initial_weight_scale":1.0,"_initial_weights":null,"_initial_biases":null,"_loss":"Automatic","_score_interval":0.0,"_score_training_samples":10000,"_score_validation_samples":0,"_score_duty_cycle":0.1,"_classification_stop":0.0,"_regression_stop":1.0E-6,"_quiet_mode":false,"_score_validation_sampling":"Uniform","_diagnostics":true,"_variable_importances":true,"_fast_mode":true,"_force_load_balance":true,"_replicate_training_data":true,"_single_node_mode":false,"_shuffle_training_data":false,"_missing_values_handling":"MeanImputation","_sparse":false,"_col_major":false,"_average_activation":0.0,"_sparsity_beta":0.0,"_max_categorical_features":2147483647,"_reproducible":false,"_export_weights_and_biases":false,"_elastic_averaging":false,"_elastic_averaging_moving_rate":0.9,"_elastic_averaging_regularization":0.001,"_mini_batch_size":1}
07-02 09:32:49.295 192.168.123.5:54321   #7248  FJ-1-25   INFO: Dropping ignored columns: [Time, Open, PCNT_RTN, PCNT_RTN100, Low, Close, High]
07-02 09:32:49.295 192.168.123.5:54321   #7248  FJ-1-25   INFO: Dataset already contains 128 chunks. No need to rebalance.
07-02 09:32:49.295 192.168.123.5:54321   #7248  FJ-1-25   INFO: Starting model DeepLearning__gen_202007020932_m_5_r_2_b_2_pp_0.05_l_1_t_10_model_1

Results of the second run

07-02 09:32:53.914 192.168.123.5:54321   #7248  #75857-32 INFO: Hyper-Parameter Search Summary (ordered by increasing logloss):
07-02 09:32:53.914 192.168.123.5:54321   #7248  #75857-32 INFO:             activation  adaptive_rate  epsilon  hidden  hidden_dropout_ratios  input_dropout_ratio  rho                                                            model_ids             logloss
07-02 09:32:53.914 192.168.123.5:54321   #7248  #75857-32 INFO:   RectifierWithDropout           true   1.0E-6   [200]                  [0.1]                 0.05  0.9  DeepLearning__gen_202007020932_m_5_r_2_b_2_pp_0.05_l_1_t_10_model_1  1.7002255980952898
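To put a number on the gap between the two runs, the short calculation below compares the two logloss values reported in the summaries above. It is only arithmetic on the logged figures; the variable names are illustrative.

```python
# Logloss values copied from the two grid summaries above.
run1_logloss = 1.7762588168309075
run2_logloss = 1.7002255980952898

# Absolute gap, and the gap relative to the first run.
abs_diff = abs(run1_logloss - run2_logloss)
rel_diff = abs_diff / run1_logloss

print(f"absolute difference: {abs_diff:.4f}")
print(f"relative difference: {rel_diff:.2%}")
```

The gap is roughly 0.076 in logloss, a bit over 4% of the first run's value, which is too large to be rounding noise between otherwise identical models.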

0 answers