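I would like some help working out how to get a confusion matrix for my test data.
I am following this tutorial, https://rstudio-pubs-static.s3.amazonaws.com/374759_30994015da0a4efdb249142ebfb1d0cd.html, and I would like to know how to finish the modelling process and obtain the confusion matrix for the test data.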
library(caret)
library(doParallel) # provides registerDoParallel(); attaches parallel for detectCores()
data <- iris
trainIdx <- sample(nrow(data), round(nrow(data)*0.9), replace = FALSE) # draw the training rows once
gbmTrain <- data[trainIdx,]
gbmTest <- data[-trainIdx,] # test set: the rows not used for training
grid <- expand.grid(n.trees = c(1000,1500), interaction.depth=c(1:3), shrinkage=c(0.01,0.05,0.1), n.minobsinnode=c(20))
ctrl <- trainControl(method = "repeatedcv",number = 5, repeats = 2, allowParallel = T)
#Register parallel cores
registerDoParallel(detectCores()-1)
#build model
set.seed(124) # for reproducibility
unwantedoutput <- capture.output(GBMModel <- train(Species~.,data = gbmTrain,
method = "gbm", trControl = ctrl, tuneGrid = grid))
print(GBMModel)
confusionMatrix(GBMModel) ## this one works
## When I want to predict on my test data, I use:
GBMModel$bestTune
myGrid <- GBMModel$bestTune
GBM.final <- train(Species~., data = gbmTrain, method = "gbm", trControl = ctrl, tuneGrid = myGrid)
prediction <- predict.train(object = GBM.final, newdata = gbmTest, type = 'prob')
## I get a prediction, but when I try to get the confusion matrix:
confusionMatrix(prediction)
# Error in is.factor(reference) :
# argument "reference" is missing, with no default
## I get the error above. If I instead try to run the model with train() on the test data, that also gives me an error.
## I would like a result like this table:
## Kappa = 0.35
##             Reference
## Prediction   setosa versicolor virginica
##   setosa       32.6        0.0       0.0
##   versicolor    0.0       31.5       1.9
##   virginica     0.0        3.3      30.7
##
##  Accuracy (average) : 0.9481
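
For reference, here is a minimal sketch of what I understand the test-set step should look like. When `confusionMatrix()` is given something other than a `train` object, it needs both the predicted classes (`data`) and the true classes (`reference`). With `type = 'prob'`, `predict()` returns a data frame of class probabilities rather than class labels, so the sketch uses `type = "raw"` instead. The names `fit`, `predClass` and `cm` are only illustrative, and the small tuning values are assumptions chosen to keep the example fast; they are not the tuned values from `GBMModel$bestTune`.

```r
library(caret)

set.seed(124)
# Stratified 90/10 split; createDataPartition() returns the training row indices
idx <- createDataPartition(iris$Species, p = 0.9, list = FALSE)
gbmTrain <- iris[idx, ]
gbmTest  <- iris[-idx, ]

# Small illustrative fit (assumed settings, not the tuned ones from the question)
fit <- train(Species ~ ., data = gbmTrain, method = "gbm",
             trControl = trainControl(method = "cv", number = 5),
             tuneGrid = expand.grid(n.trees = 100, interaction.depth = 1,
                                    shrinkage = 0.1, n.minobsinnode = 10),
             verbose = FALSE)

# type = "raw" (the default) returns the predicted class labels as a factor
predClass <- predict(fit, newdata = gbmTest, type = "raw")

# Compare predicted classes against the true classes of the held-out rows
cm <- confusionMatrix(data = predClass, reference = gbmTest$Species)
print(cm)
```

The table in `cm$table` holds counts, not percentages. If a percentage layout like the resampled table is wanted, `prop.table(cm$table) * 100` should give the cell percentages for the test set.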