Same results when tuning random forest hyperparameters in R

Time: 2019-06-11 14:01:49

Tags: r machine-learning random-forest

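I am trying to tune the hyperparameters of a random forest in R using library(randomForest). Why do I get the same results for different values of a hyperparameter (namely maxnodes)?

I am using the Titanic dataset from Kaggle. When I ran a grid search over the hyperparameter "mtry" alone, each value of "mtry" gave different Accuracy and Kappa values.

However, when I search for the best "maxnodes" value, every value returns the same Accuracy and Kappa.

This is the code for tuning "mtry":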

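library(caret)         # provides train() and trainControl()
library(randomForest)  # backend used by method = "rf"

# 10-fold cross-validation with grid search
control <- trainControl(
  method = "cv",
  number = 10,
  search = "grid"
)

# Candidate mtry values to evaluate
tuneGrid <- expand.grid(.mtry = 1:10)
rf_mtry <- train(train_X,
                 train_Y,
                 method = "rf",
                 metric = "Accuracy",
                 tuneGrid = tuneGrid,
                 trControl = control,
                 importance = TRUE,
                 nodesize = 14,
                 ntree = 300
                 )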

This is the code for tuning "maxnodes":

# Fix mtry at the best value found above
mtry_best <- rf_mtry$bestTune$mtry

store_maxnode <- list()
tuneGrid <- expand.grid(.mtry = mtry_best)

# Fit one model per maxnodes value; the seed is reset so every fit
# uses the same cross-validation folds
for (maxnodes in 2:20) {
  set.seed(1234)
  rf_maxnode <- train(train_X,
                      train_Y,
                      method = "rf",
                      metric = "Accuracy",
                      tuneGrid = tuneGrid,
                      trControl = control,
                      importance = TRUE,
                      nodesize = 5,
                      maxnodes = maxnodes,
                      ntree = 300
                      )
  current_iteration <- toString(maxnodes)
  store_maxnode[[current_iteration]] <- rf_maxnode
}

# Compare the resampling results across maxnodes values
results_maxnode <- resamples(store_maxnode)
summary(results_maxnode)
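
The summary of a resamples object can be long, so a compact table with one row per maxnodes value may make it easier to see whether Accuracy and Kappa are really identical. The following is a minimal sketch, not the code from the question: it assumes the built-in iris data as a stand-in for the Titanic data, uses a smaller grid and fewer trees to keep it quick, and the names `fits` and `comparison` are hypothetical.

```r
library(caret)
library(randomForest)

# Assumption: iris stands in for the Titanic data used in the question
control <- trainControl(method = "cv", number = 5)
grid <- expand.grid(.mtry = 2)

fits <- list()
for (mn in c(2, 5, 10)) {
  set.seed(1234)  # same folds for every maxnodes value
  fits[[as.character(mn)]] <- train(iris[, 1:4],
                                    iris$Species,
                                    method = "rf",
                                    metric = "Accuracy",
                                    tuneGrid = grid,
                                    trControl = control,
                                    ntree = 50,
                                    maxnodes = mn)
}

# One row per maxnodes value with the cross-validated means
comparison <- do.call(rbind, lapply(names(fits), function(mn) {
  data.frame(maxnodes = as.integer(mn),
             fits[[mn]]$results[, c("Accuracy", "Kappa")])
}))
print(comparison)
```

If the rows of such a table differ on a stand-in dataset but not on the real data, that would point toward something specific to the data or the other settings rather than to the loop itself.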

I expected to see different Accuracy and Kappa values for different maxnodes values, but they are all the same.
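
One way to narrow this down is to check, outside of caret, whether maxnodes actually changes the fitted trees. The sketch below is an illustration under stated assumptions rather than a diagnosis: it calls randomForest() directly on the built-in iris data (a stand-in for the Titanic data) and uses treesize() to count terminal nodes per tree. The name `check` is hypothetical.

```r
library(randomForest)

# Assumption: iris stands in for the Titanic data used in the question
x <- iris[, 1:4]
y <- iris$Species

set.seed(1234)
check <- sapply(c(2, 4, 8), function(mn) {
  fit <- randomForest(x, y, ntree = 50, maxnodes = mn)
  c(maxnodes   = mn,
    # treesize() counts terminal nodes per tree by default
    max_leaves = max(treesize(fit)),
    # out-of-bag error after the last tree
    oob_error  = unname(fit$err.rate[fit$ntree, "OOB"]))
})
print(t(check))
```

If max_leaves tracks maxnodes here, the argument is reaching randomForest(), and the same check can be repeated on the `finalModel` element of each caret fit to see whether it is also being passed through train().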

0 answers:

No answers yet