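I am trying to tune random forest hyperparameters in R using library(randomForest). Why do I get the same results for different values of a hyperparameter (namely maxnodes)?
I am using the Titanic dataset from Kaggle. I first ran a grid search over the hyperparameter "mtry" only, and each "mtry" value produced different Accuracy and Kappa values.
However, when I try to search for the best "maxnodes" value, every value returns the same Accuracy and Kappa.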
library(randomForest)
library(caret)  # provides trainControl(), train() and resamples()
# 10-fold cross-validation with grid search
control <- trainControl(
  method = "cv",
  number = 10,
  search = "grid"
)
# Step 1: grid search over mtry only
tuneGrid <- expand.grid(.mtry = c(1:10))
rf_mtry <- train(train_X,
  train_Y,
  method = "rf",
  metric = "Accuracy",
  tuneGrid = tuneGrid,
  trControl = control,
  importance = TRUE,
  nodesize = 14,
  ntree = 300
)
mtry_best <- rf_mtry$bestTune$mtry
# Step 2: keep mtry fixed at its best value and vary maxnodes
store_maxnode <- list()
tuneGrid <- expand.grid(.mtry = mtry_best)
for (maxnodes in c(2:20)) {
  set.seed(1234)
  rf_maxnode <- train(train_X,
    train_Y,
    method = "rf",
    metric = "Accuracy",
    tuneGrid = tuneGrid,
    trControl = control,
    importance = TRUE,
    nodesize = 5,
    maxnodes = maxnodes,
    ntree = 300
  )
  # Store each fitted model under its maxnodes value
  current_iteration <- toString(maxnodes)
  store_maxnode[[current_iteration]] <- rf_maxnode
}
# Compare the resampling results across maxnodes values
results_mtry <- resamples(store_maxnode)
summary(results_mtry)
I expected to see different Accuracy and Kappa values for different maxnodes values, but they are all identical.