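Recently I wrote a script that trains a random forest model to classify land use/cover types using the randomForest package in R. When I run the script 10 times, I get a different overall accuracy and kappa statistic each time. Now I want to retrain my model using K-fold cross-validation, but I don't know how to do that or how to find the best model. If I retrain my model with K-fold cross-validation, how can I obtain the mean overall accuracy and kappa statistic?
Does anyone have experience with this, or a useful example? Thank you very much.
My code is as follows: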
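#randomForest: randomForest(), margin(), importance(), varImpPlot()
#e1071: classAgreement(); caret: confusionMatrix(), varImp()
library(randomForest)
library(e1071)
library(caret)
cat("Calculating random forest object\n")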
randfor <-randomForest(as.factor(response)~.,data=trainvals,importance=TRUE, na.action=na.omit,proximity=TRUE)
#Print the randomForest model summary (OOB error estimate and confusion matrix)
print(randfor)
#Inspect the margins: a positive margin means a correct classification.
#The margins come from the out-of-bag votes on the training data,
#so the test response is not passed here.
rf.margin <- margin(randfor)
plot(rf.margin)
#Display the error rates of the randomForest model
plot(randfor)
#Predict the land cover type of the test datasets
pred <- predict(randfor,newdata = trainvalsTest)
#generate a classification table for the testing datasets
rf.table <- table(pred,responseTest)
rf.table
# Plotting variable importance plot
varImpPlot(randfor)
classAgreement(rf.table)
#Print the overall accuracy and Kappa statistic
#(confusionMatrix expects two factors with the same levels)
confusion <- confusionMatrix(pred, factor(responseTest, levels = levels(pred)))
confusion
#print the importance of all the input variables
randomForest.importance <- importance(randfor)
randomForest.importance
#using caret package to calculate the variable importance
caret.importance <- varImp(randfor,scale = FALSE)
#Print the overall importance of the input variables
print(caret.importance)
#display the variable importance plot
plot(caret.importance)
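
A minimal sketch of 10-fold cross-validation with the caret package, for illustration only. It assumes `train()` with `method = "rf"` is an acceptable replacement for calling `randomForest()` directly, and it uses the built-in `iris` data as a stand-in for `trainvals` and `response`, since the real land-cover data are not shown. The fold count, the `mtry` grid, and the seed are arbitrary choices, not recommendations.

```r
library(caret)
library(randomForest)

# Fix the RNG so repeated runs use the same folds and grow the same forests
set.seed(42)

# Stand-in data (assumption): iris replaces the land-cover training set
trainvals <- iris[, 1:4]
response  <- iris$Species

# 10-fold cross-validation
ctrl <- trainControl(method = "cv", number = 10)

# Fit a random forest for each candidate mtry, evaluating each on the 10 folds
rf_cv <- train(x = trainvals, y = as.factor(response),
               method = "rf",
               trControl = ctrl,
               tuneGrid = data.frame(mtry = 2:4),
               importance = TRUE)

# Mean Accuracy and Kappa across folds, one row per mtry value
print(rf_cv$results)

# Per-fold Accuracy and Kappa for the selected mtry, and their means
print(rf_cv$resample)
print(colMeans(rf_cv$resample[, c("Accuracy", "Kappa")]))

# randomForest object refit on all the data with the selected mtry
best_rf <- rf_cv$finalModel
```

With this setup, `rf_cv$results` holds the fold-averaged accuracy and kappa for each `mtry`, and `rf_cv$finalModel` can be passed to `varImpPlot()` or used for prediction like the `randfor` object above. Setting a seed before training is also what makes the accuracy and kappa reproducible between runs.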