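I have Bayesian optimization code that prints the Value and the selected parameters for each round. My question is: how is the best combination chosen? In my case, a different round had a lower minimum RMSE than the one reported as best.
Code: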
library(xgboost)
library(rBayesianOptimization)
data(agaricus.train, package='xgboost')
dtrain <- xgb.DMatrix(agaricus.train$data, label = agaricus.train$label)
cv_folds <- KFold(
    agaricus.train$label  # labels used to build the stratified folds
    , nfolds = 5
    , stratified = TRUE
    , seed = 5000)
xgb_cv_bayes <- function(eta, max.depth, min_child_weight, subsample, colsample_bytree) {
    cv <- xgb.cv(params = list(booster = "gbtree"
                               # , eta = 0.01
                               , eta = eta
                               , max_depth = max.depth
                               , min_child_weight = min_child_weight
                               , colsample_bytree = colsample_bytree
                               , subsample = subsample
                               #, colsample_bytree = 0.3
                               , lambda = 1
                               , alpha = 0
                               , objective = "reg:linear"
                               , eval_metric = "rmse")
                 , data = dtrain
                 , nround = 1000
                 , folds = cv_folds
                 , prediction = TRUE
                 , showsd = TRUE
                 , early_stopping_rounds = 10
                 , maximize = TRUE
                 , verbose = 0
                 , finalize = TRUE)
    list(Score = cv$evaluation_log[,min(test_rmse_mean)]
         , Pred = cv$pred
         , cb.print.evaluation(period = 1))
}
cat("Calculating Bayesian Optimum Parameters\n")
OPT_Res <- BayesianOptimization(xgb_cv_bayes
    , bounds = list(
        eta = c(0.001, 0.03)
        , max.depth = c(3L, 10L)
        , min_child_weight = c(3L, 10L)
        , subsample = c(0.8, 1)
        , colsample_bytree = c(0.5, 1))
    , init_grid_dt = NULL
    , init_points = 10
    , n_iter = 200
    , acq = "ucb"
    , kappa = 3
    , eps = 1.5
    , verbose = TRUE)
Answer 0 (score: 3)
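From help(BayesianOptimization), argument FUN:
The function to be maximized. This Function should return a named list with 2 components. The first component "Score" should be the metrics to be maximized, and the second component "Pred" should be the validation/cross-validation prediction for ensembling/stacking.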
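As a rough illustration of that contract, here is a toy scoring function that returns the two named components. The function `toy_fun` and its quadratic objective are made up for this sketch and are not part of the question; `Pred = 0` is only a placeholder because the toy has no real predictions.

```r
# Hypothetical toy objective: its Score is largest at x = 2
toy_fun <- function(x) {
  list(Score = -(x - 2)^2,  # the metric the optimizer will try to maximize
       Pred  = 0)           # placeholder for cross-validation predictions
}

res <- toy_fun(1.5)
component_names <- names(res)
```

A function of this shape is what would be passed as FUN, with one entry in bounds per argument.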
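Your function returns Score = cv$evaluation_log[,min(test_rmse_mean)]. You want to minimize this value, not maximize it. As written, the optimizer treats the largest RMSE as the best result, which is why other rounds show a lower RMSE than the reported optimum. Try returning the negative, so that maximizing the returned value minimizes the RMSE: Score = -cv$evaluation_log[,min(test_rmse_mean)]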
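To see why the sign flip works, here is a minimal base R sketch. The function `toy_rmse` is a hypothetical stand-in for the cross-validated RMSE (it is not xgboost output), and a plain grid search stands in for the optimizer; only the eta range is taken from the question.

```r
# Hypothetical stand-in for the cross-validated RMSE; smallest near eta = 0.02
toy_rmse <- function(eta) sqrt((eta - 0.02)^2 + 0.01)

# Candidate values within the eta bounds used in the question
grid <- seq(0.001, 0.03, length.out = 30)

rmse  <- vapply(grid, toy_rmse, numeric(1))
score <- -rmse  # negated, so that larger is better

best_by_score <- grid[which.max(score)]  # what a maximizer would select
best_by_rmse  <- grid[which.min(rmse)]   # what we actually want
best_rmse     <- -max(score)             # flip the sign back to read the RMSE
```

Both selections land on the same eta. With the real optimizer, the printed Value and OPT_Res$Best_Value would then be the negative RMSE, so negate them when reading the results.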