Bayesian optimization of hyperparameters in R

Date: 2019-10-25 15:56:47

Tags: r machine-learning data-science xgboost

I have been looking into Bayesian optimization for hyperparameter tuning, and I am trying to compare the results it produces with those obtained by a different method (random grid search).

I came across this site, where the author uses the mlrMBO package to maximize the log-likelihood (see Example 2): https://www.simoncoulombe.com/2019/01/bayesian/. My scenario is different: I want to minimize the log loss, so I made some small adjustments to the author's code when defining the objective function, but I am not sure they are correct. His objective function returns the maximum of the test log-likelihood obtained through cross-validation, and the minimize argument of makeSingleObjectiveFunction (from the smoof package) is set to FALSE. Since I want to minimize the log loss, I instead return the minimum of the log loss from cross-validation and set the minimize argument to TRUE. Because this is my first attempt at using the package, and I am fairly new to machine learning in general, I am not sure whether my code is correct. Any insight would be appreciated! Here is my objective function:


library(xgboost)       # xgb.cv()
library(smoof)         # makeSingleObjectiveFunction()
library(ParamHelpers)  # makeParamSet(), makeNumericParam(), makeIntegerParam()

# `dtrain` (an xgb.DMatrix) and `cv_folds` (a list of cross-validation fold
# indices) are assumed to be defined earlier, as in the linked blog post.
obj.fun <- makeSingleObjectiveFunction(
  name = "xgb_cv_bayes",
  fn = function(x) {
    set.seed(12345)
    cv <- xgb.cv(
      params = list(
        booster          = "gbtree",
        eta              = x["eta"],
        max_depth        = x["max_depth"],
        min_child_weight = x["min_child_weight"],
        gamma            = x["gamma"],
        subsample        = x["subsample"],
        colsample_bytree = x["colsample_bytree"],
        objective        = "binary:logistic",
        eval_metric      = "logloss"
      ),
      data = dtrain,
      nrounds = x["nrounds"],
      folds = cv_folds,
      prediction = FALSE,
      showsd = TRUE,
      early_stopping_rounds = 10,
      verbose = 0
    )

    # Return the smallest mean test log loss observed across boosting rounds.
    cv$evaluation_log[, min(test_logloss_mean)]
  },
  par.set = makeParamSet(
    makeNumericParam("eta",              lower = 0.1, upper = 0.5),
    makeNumericParam("gamma",            lower = 0,   upper = 5),
    makeIntegerParam("max_depth",        lower = 3,   upper = 6),
    makeIntegerParam("min_child_weight", lower = 1,   upper = 2),
    makeNumericParam("subsample",        lower = 0.6, upper = 0.8),
    makeNumericParam("colsample_bytree", lower = 0.5, upper = 0.7),
    makeIntegerParam("nrounds",          lower = 100, upper = 1000)
  ),
  # Smaller objective values (log loss) are better.
  minimize = TRUE
)
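
For completeness, this is roughly how I then run the optimization, following the pattern from the blog post; the design size and iteration count below are arbitrary placeholders, not tuned values:

library(mlrMBO)  # mbo(), makeMBOControl(), setMBOControlTermination()
library(lhs)     # randomLHS(), for the initial space-filling design

# Initial design: 15 random Latin-hypercube points (placeholder size).
des <- generateDesign(n = 15, par.set = getParamSet(obj.fun),
                      fun = lhs::randomLHS)

# Stop after 10 Bayesian-optimization iterations (placeholder count).
control <- makeMBOControl()
control <- setMBOControlTermination(control, iters = 10)

# Because minimize = TRUE above, mbo() should search for the hyperparameter
# combination with the lowest cross-validated log loss.
run <- mbo(fun = obj.fun, design = des, control = control, show.info = TRUE)

run$x  # best hyperparameters found
run$y  # corresponding minimum test log loss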

0 Answers:

No answers yet