Question

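I am working in R and trying to find the best hyperparameters for an xgboost model I want to run. My data set has about 700 variables (some numeric, others one-hot encoded) and about 25,000 observations. I am trying to predict whether each observation is large (prediction = 1) or small (prediction = 0). The problem is that when I run the xgb.cv function, train-error and test-error do not change from one iteration to the next. My code and the resulting printout are below. Can anyone explain why the error stays the same? Thanks very much!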

The R code:

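library(xgboost)

# Build the training matrix from the predictor rows and their 0/1 labels.
dtrain <- xgb.DMatrix(data = pred[train, ], label = resp[train])

# 5-fold cross-validation; stop if the test metric fails to improve for 8 rounds.
xgb.cv(data = dtrain,
       params = list(objective = "binary:logistic",
                     eta = 0.01,
                     max_depth = 10,
                     min_child_weight = 20,
                     colsample_bytree = 0.2),
       nfold = 5,
       nrounds = 100,
       verbose = TRUE,
       early_stopping_rounds = 8,
       maximize = FALSE)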

Console printout:

[1] train-error:0.014422+0.000491   test-error:0.014422+0.001965
Multiple eval metrics are present. Will use test_error for early stopping.
Will train until test_error hasn't improved in 8 rounds.

[2] train-error:0.014422+0.000491   test-error:0.014422+0.001965 
[3] train-error:0.014422+0.000491   test-error:0.014422+0.001965 
[4] train-error:0.014422+0.000491   test-error:0.014422+0.001965 
[5] train-error:0.014422+0.000491   test-error:0.014422+0.001965 
[6] train-error:0.014422+0.000491   test-error:0.014422+0.001965 
[7] train-error:0.014422+0.000491   test-error:0.014422+0.001965 
[8] train-error:0.014422+0.000491   test-error:0.014422+0.001965 
[9] train-error:0.014422+0.000491   test-error:0.014422+0.001965 
Stopping. Best iteration:
[1] train-error:0.014422+0.000491   test-error:0.014422+0.001965

Thanks again for your help!
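
A possible explanation worth checking, offered as a sketch rather than a diagnosis: the `error` metric for `binary:logistic` counts a prediction as positive only when the predicted probability exceeds 0.5. If the positive class is rare and no prediction crosses 0.5 in the early rounds (plausible with eta = 0.01), every row is classified as 0 and the error equals the share of positives, round after round. The base R snippet below illustrates this with synthetic labels. The 1.44% positive rate and the probability range are assumptions chosen to mirror the printout, not values from the real data.

```r
# Synthetic stand-in for the real labels: roughly 1.44% positives (assumed).
set.seed(42)
n <- 25000
label <- rbinom(n, size = 1, prob = 0.0144)

# Hypothetical model output early in training: every predicted probability
# is still below the 0.5 cut-off (range chosen purely for illustration).
prob <- runif(n, min = 0.40, max = 0.49)

# The "error" metric thresholds probabilities at 0.5.
pred_class <- as.integer(prob > 0.5)
error <- mean(pred_class != label)

# When nothing crosses 0.5, the error is exactly the share of positives,
# no matter how the probabilities shift underneath the threshold.
print(c(error = error, positive_rate = mean(label)))
```

On the real data, `mean(resp[train])` would show whether the positive rate is close to 0.0144. If it is, a threshold-free metric such as AUC or logloss may reveal progress that the error rate hides.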

Edit/Update: I tried the following code, with the output shown below.

New code using multiple parameter values:

param <- list(objective = "binary:logistic",
              eta = c(0.01, 0.05, 0.1, 0.5, 1),
              max_depth = 10,
              min_child_weight = 20,
              colsample_bytree = c(0.1, 0.2, 0.5, 1))
cv <- xgb.cv(data = dtrain,
             params = param,
             nfold = 5,
             nrounds = 100,
             verbose = TRUE,
             early_stopping_rounds = 8,
             maximize = FALSE)

Console output:

[1] train-error:0.014422+0.000189   test-error:0.014422+0.000756 
Multiple eval metrics are present. Will use test_error for early stopping.
Will train until test_error hasn't improved in 8 rounds.

[2] train-error:0.014422+0.000189   test-error:0.014422+0.000756 
[3] train-error:0.014422+0.000189   test-error:0.014422+0.000756 
[4] train-error:0.014422+0.000189   test-error:0.014422+0.000756 
[5] train-error:0.014422+0.000189   test-error:0.014422+0.000756 
[6] train-error:0.014422+0.000189   test-error:0.014422+0.000756 
[7] train-error:0.014422+0.000189   test-error:0.014422+0.000756 
[8] train-error:0.014422+0.000189   test-error:0.014422+0.000756 
[9] train-error:0.014422+0.000189   test-error:0.014422+0.000756 
Stopping. Best iteration:
[1] train-error:0.014422+0.000189   test-error:0.014422+0.000756
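
As far as I know, a single xgb.cv call evaluates one parameter configuration, and supplying vectors for eta or colsample_bytree is not a documented way to search over combinations. A minimal sketch of one alternative is to loop over a grid built with `expand.grid` and run xgb.cv once per row, tracking AUC instead of the thresholded error. The synthetic `dtrain` below is a hypothetical stand-in so the snippet runs on its own, and the names `grid`, `results`, and `summary_tbl` are illustrative. The sketch assumes the classic R interface in which `xgb.cv` returns an `evaluation_log` table.

```r
library(xgboost)

# Hypothetical stand-in data with a rare positive class; replace with the
# real xgb.DMatrix built from pred[train, ] and resp[train].
set.seed(1)
n <- 5000
x <- matrix(rnorm(n * 10), ncol = 10)
y <- rbinom(n, size = 1, prob = plogis(-4 + 1.5 * x[, 1]))
dtrain <- xgb.DMatrix(data = x, label = y)

# One row per combination of the candidate values from the question.
grid <- expand.grid(eta = c(0.01, 0.05, 0.1, 0.5, 1),
                    colsample_bytree = c(0.1, 0.2, 0.5, 1))

results <- lapply(seq_len(nrow(grid)), function(i) {
  params <- list(objective = "binary:logistic",
                 eval_metric = "auc",   # threshold-free, so it can move each round
                 eta = grid$eta[i],
                 max_depth = 10,
                 min_child_weight = 20,
                 colsample_bytree = grid$colsample_bytree[i])
  cv <- xgb.cv(data = dtrain, params = params, nfold = 5, nrounds = 100,
               verbose = FALSE, early_stopping_rounds = 8,
               maximize = TRUE)         # higher AUC is better
  log <- cv$evaluation_log
  data.frame(grid[i, ],
             best_round = which.max(log$test_auc_mean),
             best_auc = max(log$test_auc_mean))
})

# One summary row per configuration, best mean test AUC first.
summary_tbl <- do.call(rbind, results)
summary_tbl <- summary_tbl[order(-summary_tbl$best_auc), ]
print(summary_tbl)
```

Note that `maximize` is switched to TRUE here because AUC improves upward. With logloss as the metric it would stay FALSE.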

R - xgb.cv test/train error does not change between iterations

0 answers: