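XGBoost model performance reporting on validation data

Time: 2016-10-10 10:33:38

Tags: r logging xgboost

I would like to use XGBoost's early.stop.round feature to avoid overfitting during training. To do so, I use the following code: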

param2 <- list("objective" = "reg:linear",
                     "eval_metric" = "rmse",
                     "max_depth" = 15,
                     "eta" = 0.03,
                     "gamma" = 0,
                     "subsample" = 0.5,
                    "colsample_bytree" = 0.6,
                     "min_child_weight" = 5,
                     "alpha" = 0.15)

  watchlist <- list(train = xgb.DMatrix(data = train_matrix, label = output_train),
                  test = xgb.DMatrix(data = total_matrix[ind, ], label = as.matrix(output_total[ind, ])))

  bst <- xgboost(data=train_matrix, label=output_train, nrounds = 500, watchlist = watchlist,
                        early.stop.round=5,verbose = 2, param=param2, missing = NaN)

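As required, I create train and test xgb.DMatrix objects for the watchlist and pass it to xgboost(). I made sure verbose is set so that intermediate results are printed. But with verbose=2 I get logs like the following: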

tree prunning end, 1 roots, 1692 extra nodes, 0 pruned nodes ,max_depth=15
[74]    train-rmse:0.129515
tree prunning end, 1 roots, 1874 extra nodes, 0 pruned nodes ,max_depth=15
[75]    train-rmse:0.128455
tree prunning end, 1 roots, 1826 extra nodes, 0 pruned nodes ,max_depth=15
[76]    train-rmse:0.127804
tree prunning end, 1 roots, 1462 extra nodes, 0 pruned nodes ,max_depth=15
[77]    train-rmse:0.126874
tree prunning end, 1 roots, 1848 extra nodes, 0 pruned nodes ,max_depth=15
[78]    train-rmse:0.125914

verbose=1 gives me:

[74]    train-rmse:0.129515
[75]    train-rmse:0.128455
[76]    train-rmse:0.127804
[77]    train-rmse:0.126874
[78]    train-rmse:0.125914

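But neither of these reports the model's performance on the test DMatrix at each step. I also tried the following, without success:

  1. verbose=T and verbose=F
  2. Renaming the test DMatrix to validation

What am I missing to get the desired output?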

1 Answer:

Answer 0: (score: 0)

Apparently, test-set performance reporting can only be done with xgb.train(), not with xgboost(). The relevant modified code (not repeating the param part from above) looks like:

  dtrain <- xgb.DMatrix(data = train_matrix, label = output_train)
  dtest <- xgb.DMatrix(data = total_matrix[ind, ], label = as.matrix(output_total[ind, ]))
  watchlist <- list(train = dtrain, test = dtest)

  bst <- xgb.train(data = dtrain, nrounds = 500, watchlist = watchlist, prediction = T,
                   early.stop.round = 5, verbose = 1, param = param2, missing = NaN)
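
To make clear why the per-round test metric matters for early.stop.round, here is a minimal plain-R sketch of an early-stopping rule. It is an illustration only, not XGBoost's actual implementation: the function name early_stop_round, its patience argument, and the RMSE values are all hypothetical, and it assumes a metric where lower is better (such as rmse).

```r
# Sketch of an early-stopping rule (hypothetical helper, not part of xgboost).
# Tracks the round with the lowest validation metric and stops once
# `patience` rounds have passed since that best round.
early_stop_round <- function(valid_metric, patience = 5) {
  best_round <- 1
  stopped_at <- 1
  for (i in seq_along(valid_metric)) {
    stopped_at <- i
    if (valid_metric[i] < valid_metric[best_round]) {
      best_round <- i            # new best validation score
    } else if (i - best_round >= patience) {
      break                      # no improvement for `patience` rounds
    }
  }
  list(best_round = best_round,
       best_value = valid_metric[best_round],
       stopped_at = stopped_at)
}

# Made-up test RMSE values: they improve up to round 4, then get worse.
test_rmse <- c(0.30, 0.25, 0.22, 0.21, 0.215, 0.22, 0.23, 0.24, 0.25, 0.26)
res <- early_stop_round(test_rmse, patience = 5)
print(res)
```

A rule like this needs a held-out metric to be useful: the train-rmse values in the logs above decrease at every round, so a stopping criterion based on them alone would not be expected to trigger.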
