I have been looking into Bayesian optimization for hyperparameter tuning and am trying to compare the results against those from a different method (random grid search). I came across this site, where the author uses the mlrMBO package to maximize the log likelihood (see example 2): https://www.simoncoulombe.com/2019/01/bayesian/. My scenario is different: I want to minimize the log loss, so I made some small adjustments to the author's code when defining the objective function, but I am not sure they are correct. His objective function returns the maximum of the test log likelihood obtained from cross-validation, with the minimize argument of the makeSingleObjectiveFunction function (from the smoof package) set to FALSE. Since I want to minimize the log loss, I instead return the minimum of the log loss from cross-validation and set the minimize argument to TRUE. As this is my first attempt at using this package, and I am fairly new to machine learning in general, I am not sure whether my code is correct. Any insight would be greatly appreciated!
library(smoof)         # makeSingleObjectiveFunction
library(ParamHelpers)  # makeParamSet and the make*Param constructors
library(xgboost)       # xgb.cv

obj.fun <- makeSingleObjectiveFunction(
  name = "xgb_cv_bayes",
  fn = function(x) {
    set.seed(12345)
    cv <- xgb.cv(
      params = list(
        booster          = "gbtree",
        eta              = x["eta"],
        max_depth        = x["max_depth"],
        min_child_weight = x["min_child_weight"],
        gamma            = x["gamma"],
        subsample        = x["subsample"],
        colsample_bytree = x["colsample_bytree"],
        objective        = "binary:logistic",
        eval_metric      = "logloss"
      ),
      data = dtrain,        # xgb.DMatrix with the training data (built beforehand)
      nrounds = x["nrounds"],
      folds = cv_folds,     # predefined list of cross-validation fold indices
      prediction = FALSE,
      showsd = TRUE,
      early_stopping_rounds = 10,
      verbose = 0
    )
    # Return the smallest mean test log loss across boosting rounds;
    # evaluation_log is a data.table, hence the [, min(...)] syntax.
    cv$evaluation_log[, min(test_logloss_mean)]
  },
  par.set = makeParamSet(
    makeNumericParam("eta", lower = 0.1, upper = 0.5),
    makeNumericParam("gamma", lower = 0, upper = 5),
    makeIntegerParam("max_depth", lower = 3, upper = 6),
    makeIntegerParam("min_child_weight", lower = 1, upper = 2),
    makeNumericParam("subsample", lower = 0.6, upper = 0.8),
    makeNumericParam("colsample_bytree", lower = 0.5, upper = 0.7),
    makeIntegerParam("nrounds", lower = 100, upper = 1000)
  ),
  minimize = TRUE  # tell mlrMBO to minimize the returned log loss
)
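
For context, this is a minimal sketch of how such an objective function is typically passed to mlrMBO's mbo() routine; the initial design size and the iteration budget below are arbitrary placeholders of my own, not values from the original post:

library(mlrMBO)

# Random initial design over the objective's parameter set
# (the size 10 is an arbitrary choice for illustration).
des <- generateDesign(n = 10, par.set = getParamSet(obj.fun))

# Default sequential MBO control, stopped after 20 iterations
# (again, an arbitrary budget).
ctrl <- makeMBOControl()
ctrl <- setMBOControlTermination(ctrl, iters = 20)

run <- mbo(obj.fun, design = des, control = ctrl, show.info = TRUE)

run$x  # best hyperparameter configuration found
run$y  # corresponding (minimized) mean test log loss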