I have a model like the following:
library(mlbench)
data(Sonar)
library(caret)

set.seed(998)
my_data <- Sonar

fitControl <-
  trainControl(
    method = "cv",
    number = 10,
    classProbs = TRUE,
    savePredictions = TRUE,
    summaryFunction = twoClassSummary
  )

model <- train(
  Class ~ .,
  data = my_data,
  method = "xgbTree",
  trControl = fitControl,
  metric = "ROC"
)
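However, with 10-fold cross-validation the chosen folds stay fixed, determined by the rows in which the samples appear in the training data.
How can I make it randomly select 10% of the training data as the held-out set for each cross-validation resample? I believe this is called Monte Carlo cross-validation.
Thanks!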
Answer 0 (score: 3)
You should use:
fitControl <-
  trainControl(
    method = "LGOCV",
    # p is the training proportion (between 0 and 1), so 0.9 holds out 10%
    p = 0.9,
    classProbs = TRUE,
    savePredictions = TRUE,
    summaryFunction = twoClassSummary
  )
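Monte Carlo cross-validation is also known as leave-group-out cross-validation (LGOCV).
"p" is the share of the data used for training in each resample, given as a proportion between 0 and 1; the remainder is held out.
More information: https://stats.stackexchange.com/questions/51416/k-fold-vs-monte-carlo-cross-validation
and http://appliedpredictivemodeling.com/blog/2014/11/27/vpuig01pqbklmi72b8lcl3ij5hj2qm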
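To make the idea concrete, here is a minimal base-R sketch of what Monte Carlo (leave-group-out) resampling does: each resample draws a fresh random 90% of the rows for training and holds out the remaining 10%. This is only an illustration, not caret's internal code. The names `n_resamples` and `splits` are made up for this example, and the sampling here is not stratified by class.

```r
# Illustrative sketch only: unstratified Monte Carlo splits in base R.
set.seed(998)

n <- 208           # number of rows in the Sonar data
p <- 0.9           # training proportion; the other 10% is held out
n_resamples <- 10  # hypothetical number of random splits

splits <- lapply(seq_len(n_resamples), function(i) {
  # Draw a new random training set each time, independent of earlier draws
  train_idx <- sort(sample.int(n, size = floor(p * n)))
  list(
    train   = train_idx,
    holdout = setdiff(seq_len(n), train_idx)
  )
})

# Size of the training and hold-out sets in the first resample
length(splits[[1]]$train)
length(splits[[1]]$holdout)
```

Because every split is drawn independently, a given row can land in the hold-out set several times or not at all, which is the main difference from k-fold cross-validation.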
Answer 1 (score: 0)
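I believe that using method = "repeatedcv" together with the repeats argument will re-draw the cross-validation folds on each repeat:
fitControl <-
  trainControl(
    method = "repeatedcv",
    number = 10,
    repeats = 5,
    classProbs = TRUE,
    savePredictions = TRUE,
    summaryFunction = twoClassSummary
  )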
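For comparison, here is a rough base-R sketch of how repeated k-fold resampling behaves. It is an illustration under simplifying assumptions (unstratified folds; the names `repeated_folds` and `each_row_once` are invented here), not caret's implementation. Each repeat reshuffles the rows into 10 new folds, but within a repeat every row is still held out exactly once.

```r
# Illustrative sketch only: unstratified repeated k-fold assignment in base R.
set.seed(998)

n <- 208      # number of rows in the Sonar data
k <- 10       # folds per repeat
repeats <- 5  # number of times the fold assignment is re-drawn

repeated_folds <- lapply(seq_len(repeats), function(r) {
  # Shuffle fold labels 1..k across the rows, then group row indices by fold
  fold_id <- sample(rep(seq_len(k), length.out = n))
  split(seq_len(n), fold_id)
})

# Within each repeat, check that every row is held out exactly once
each_row_once <- sapply(repeated_folds, function(folds) {
  all(tabulate(unlist(folds), nbins = n) == 1)
})
each_row_once
```

So repeated k-fold does re-randomize the folds, but it still partitions the data within each repeat, whereas the Monte Carlo approach draws each hold-out set independently.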