r - Bayesian optimization of results

Time: 2017-07-21 13:37:11

Tags: r

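I have Bayesian optimization code that prints the results with the Value and the selected parameters for each round. My question is: how is the best combination chosen? In my case the minimum RMSE was lower in a different round.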

Code:

library(xgboost)
library(rBayesianOptimization)


data(agaricus.train, package='xgboost')
dtrain <- xgb.DMatrix(agaricus.train$data, label = agaricus.train$label)

cv_folds <- KFold(
                  agaricus.train$label  # label vector; the original `y` was never defined
                  , nfolds = 5
                  , stratified = TRUE
                  , seed = 5000)

xgb_cv_bayes <- function(eta, max.depth, min_child_weight, subsample,colsample_bytree ) {
  cv <- xgb.cv(params = list(booster = "gbtree"
                            # , eta = 0.01
                             , eta = eta
                             , max_depth = max.depth
                             , min_child_weight = min_child_weight
                             , colsample_bytree = colsample_bytree
                             , subsample = subsample
                            #, colsample_bytree = 0.3
                             , lambda = 1
                             , alpha = 0
                             , objective = "reg:linear"
                             , eval_metric = "rmse")
               , data = dtrain
               , nround = 1000
               , folds = cv_folds
               , prediction = TRUE
               , showsd = TRUE
               , early_stopping_rounds = 10
               , maximize = TRUE
               , verbose = 0
               , finalize = TRUE)
  list(Score = cv$evaluation_log[,min(test_rmse_mean)]
       ,Pred = cv$pred
       , cb.print.evaluation(period = 1))
}

cat("Calculating Bayesian Optimum Parameters\n")

OPT_Res <- BayesianOptimization(xgb_cv_bayes
                                , bounds = list(
                                  eta = c(0.001, 0.03)
                                , max.depth = c(3L, 10L)
                                , min_child_weight = c(3L, 10L)
                                , subsample = c(0.8, 1)
                                , colsample_bytree = c(0.5, 1))

                                , init_grid_dt = NULL
                                , init_points = 10
                                , n_iter = 200
                                , acq = "ucb"
                                , kappa = 3
                                , eps = 1.5
                                , verbose = TRUE)

1 answer:

Answer 0: (score: 3)

From help(BayesianOptimization), on the parameter FUN:

  

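The function to be maximized. This function should return a named list with 2 components. The first component "Score" should be the metric to be maximized, and the second component "Pred" should be the validation/cross-validation prediction for ensembling/stacking.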

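Your function returns Score = cv$evaluation_log[,min(test_rmse_mean)]. You want to minimize this value, not maximize it. Try returning its negative, so that maximizing the returned value minimizes the RMSE: Score = -cv$evaluation_log[,min(test_rmse_mean)]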
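For illustration, here is a minimal sketch of the objective with the sign flipped, plus one way to read the winning combination back out. Several things in it are assumptions rather than part of the question: the folds are built from agaricus.train$label, "reg:squarederror" stands in for the deprecated "reg:linear", maximize is set to FALSE so that early stopping treats lower RMSE as better (the question's code passes TRUE), Pred is a 0 placeholder as in the package's own help example, and nrounds, init_points and n_iter are shrunk so it finishes quickly.

```r
library(xgboost)
library(rBayesianOptimization)

data(agaricus.train, package = "xgboost")
dtrain <- xgb.DMatrix(agaricus.train$data, label = agaricus.train$label)

# Assumption: folds are stratified on the label vector.
cv_folds <- KFold(agaricus.train$label, nfolds = 5, stratified = TRUE, seed = 5000)

xgb_cv_bayes <- function(eta, max.depth, min_child_weight, subsample, colsample_bytree) {
  cv <- xgb.cv(
    params = list(
      booster = "gbtree",
      eta = eta,
      max_depth = max.depth,
      min_child_weight = min_child_weight,
      subsample = subsample,
      colsample_bytree = colsample_bytree,
      objective = "reg:squarederror",  # assumption: replaces deprecated "reg:linear"
      eval_metric = "rmse"
    ),
    data = dtrain,
    nrounds = 50,                # kept small so the sketch runs quickly
    folds = cv_folds,
    early_stopping_rounds = 10,
    maximize = FALSE,            # lower RMSE is better for early stopping
    verbose = 0
  )
  # BayesianOptimization maximizes Score, so return the negated RMSE.
  # Pred = 0 is a placeholder; return out-of-fold predictions instead
  # if they are needed for stacking.
  list(Score = -min(cv$evaluation_log$test_rmse_mean), Pred = 0)
}

# Single evaluation: Score is the negative of the best CV RMSE.
res <- xgb_cv_bayes(eta = 0.03, max.depth = 4L, min_child_weight = 3L,
                    subsample = 0.9, colsample_bytree = 0.8)

set.seed(1)
opt <- BayesianOptimization(
  xgb_cv_bayes,
  bounds = list(
    eta = c(0.001, 0.03),
    max.depth = c(3L, 10L),
    min_child_weight = c(3L, 10L),
    subsample = c(0.8, 1),
    colsample_bytree = c(0.5, 1)
  ),
  init_points = 5,
  n_iter = 3,
  acq = "ucb",
  kappa = 3,
  verbose = FALSE
)

opt$Best_Par     # parameter combination with the highest Score
-opt$Best_Value  # the corresponding (lowest) RMSE
```

With the negated score, the round with the largest Value is the one with the smallest RMSE, so Best_Par is the combination to use. The History element of the result holds every round if you want to compare them yourself.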