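I have rephrased my question:
I am using R with the caret package.
I split my data set into three parts: training, validation, and test sets. The validation set will be used to tune the training parameters.
caret's train function tunes the training parameters by resampling (bootstrap by default). Is there a way to tell train to use my validation data set for parameter tuning instead of resampling?
Right now I have to use a loop, as you can see in the example below.
Example code: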
library(caret)
set.seed(3)
data("segmentationData")
#
# Partition data set in training, validation, testing.
#
inTraining <- createDataPartition(segmentationData$Class, p=.60, list=FALSE)
training <- segmentationData[ inTraining,]
notTraining <- segmentationData[-inTraining,]
inValidation <- createDataPartition(notTraining$Class, p=.50, list=FALSE)
validation <- notTraining[inValidation,]
testing <- notTraining[-inValidation,]
#
# The model will be trained using method 'rpart',
# it has cp (Complexity Parameter) as only tuning parameter.
#
# The training will be tuned using different values for cp.
# We'll choose the cp that maximizes accuracy.
#
cps <- c(0, 0.001, 0.003, 0.01, 0.03)
maxAccuracy <- -1
for (currentCp in cps) {
  #
  # Call train function using currentCp and train control set to 'none'.
  #
  f <- train(Class ~ ., training, method = 'rpart',
             trControl = trainControl(method = "none"),
             tuneGrid = data.frame(cp = currentCp))
  #
  # Predict on validation data set.
  #
  pr <- predict(f, validation)
  #
  # Select cp that maximizes accuracy.
  #
  cm <- confusionMatrix(pr, validation$Class)
  currentAccuracy <- cm$overall[["Accuracy"]]
  if (currentAccuracy > maxAccuracy) {
    cpMaxAccuracy <- currentCp
    maxAccuracy <- currentAccuracy
  }
}
#
# Output.
#
cpMaxAccuracy
maxAccuracy
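
A possible alternative to the loop, sketched here as an assumption rather than a confirmed answer: trainControl accepts index and indexOut arguments that fix which rows are used for fitting and which are held out, so a single "resample" can be made to coincide with the training/validation split. The names combined, trainRows, validationRows and the list label Holdout are made up for illustration. The indexFinal argument (which restricts the final fit to the training rows) is assumed to be available in the installed caret version. With only one holdout set the reported standard deviations will be NA, and train may emit a warning about that.

```r
library(caret)
set.seed(3)
data("segmentationData")

# Same 60/20/20 split as in the loop example.
inTraining   <- createDataPartition(segmentationData$Class, p = .60, list = FALSE)
training     <- segmentationData[ inTraining, ]
notTraining  <- segmentationData[-inTraining, ]
inValidation <- createDataPartition(notTraining$Class, p = .50, list = FALSE)
validation   <- notTraining[ inValidation, ]

# Stack training and validation rows so that one data frame holds both
# (hypothetical helper names, not part of caret).
combined       <- rbind(training, validation)
trainRows      <- seq_len(nrow(training))
validationRows <- nrow(training) + seq_len(nrow(validation))

ctrl <- trainControl(
  method     = "cv",                              # label only; rows come from index/indexOut
  index      = list(Holdout = trainRows),         # rows used to fit each candidate model
  indexOut   = list(Holdout = validationRows),    # rows used to score each candidate model
  indexFinal = trainRows                          # assumption: refit the final model on training rows only
)

fit <- train(Class ~ ., data = combined, method = "rpart",
             trControl = ctrl,
             tuneGrid = data.frame(cp = c(0, 0.001, 0.003, 0.01, 0.03)))

# Selected cp and the validation accuracy for each candidate.
fit$bestTune
fit$results[, c("cp", "Accuracy")]
```

If this works as assumed, fit$bestTune should hold the cp with the highest validation accuracy, although ties may be broken differently than in the loop.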