I have been using caret to build a decision tree model with the "rpart2" method and repeated 10-fold cross-validation in R.
set.seed(888)
DT_up_rpart <- train(DiagDM1 ~ ., data = training, method = "rpart2", trControl = ctrl2, metric = "ROC", tuneGrid = maxdepthGrid)
From this call I got the following result:
## CART
## 56662 samples
## 11 predictor
## 2 classes: 'No', 'Si'
## No pre-processing
## Resampling: Cross-Validated (10 fold, repeated 5 times)
## Summary of sample sizes: 50995, 50995, 50996, 50996, 50996, 50996, ...
## Resampling results across tuning parameters:
##
## maxdepth  ROC        Sens       Spec       Accuracy   Kappa
## 3         0.7378576  0.7408495  0.6449825  0.6929158  0.3858318
## 4         0.7382515  0.7367128  0.6502273  0.6934699  0.3869400
## 5         0.7382515  0.7367128  0.6502273  0.6934699  0.3869400
## 6         0.7382515  0.7367128  0.6502273  0.6934699  0.3869400
##
## ROC was used to select the optimal model using the largest value.
## The final value used for the model was maxdepth = 4
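As you can see, the final value chosen for the model was maxdepth = 4. So now, using the same training dataset, the same seed, the same performance metric, and so on, I train a new model with maxdepth held constant at 4: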
set.seed(888)
DT_up_rpart2 <- train(DiagDM1 ~ ., data = training, method = "rpart2", trControl = ctrl2, metric = "ROC", tuneGrid = data.frame(maxdepth = 4))
And I got these results:
## Resampling: Cross-Validated (10 fold, repeated 5 times)
## Summary of sample sizes: 50995, 50995, 50996, 50996, 50996, 50996, ...
## Resampling results:
## ROC        Sens       Spec       Accuracy   Kappa
## 0.7379012  0.7391975  0.6464791  0.6928382  0.3856765
##
## Tuning parameter 'maxdepth' was held constant at a value of 4
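Note that in the first result shown above, maxdepth = 4 gives ROC = 0.7382515, whereas in the second model maxdepth = 4 gives ROC = 0.7379012, and the same goes for every other performance measure. I expected to get identical results from both models. That is why I set the seed in both cases.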
Why am I getting different results? I am using the same dataset, the same seed, and so on. Or am I defining the seed the wrong way?
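To narrow this down, one thing that could be checked is whether both calls actually resample on the same folds. The snippet below is only a minimal sketch of that check, not my real setup: the simulated data from `twoClassSim()`, the `ctrl` object, and the names `fit_grid`, `fit_single`, `same_folds`, and `per_fold` are all assumptions made for illustration (my actual `ctrl2` and `maxdepthGrid` are not shown here and may differ, for example in sampling options). It assumes the caret, rpart, and pROC packages are installed.

```r
library(caret)  # assumes rpart and pROC are installed as well

# Simulated two-class data as a stand-in for `training` (hypothetical substitute)
set.seed(1)
dat <- twoClassSim(500)

# Hypothetical control object; the real ctrl2 may contain other options
ctrl <- trainControl(method = "repeatedcv", number = 10, repeats = 5,
                     classProbs = TRUE, summaryFunction = twoClassSummary)

# Run 1: tune over a grid of depths
set.seed(888)
fit_grid <- train(Class ~ ., data = dat, method = "rpart2", trControl = ctrl,
                  metric = "ROC", tuneGrid = data.frame(maxdepth = 3:6))

# Run 2: hold maxdepth constant at the value selected in run 1
set.seed(888)
fit_single <- train(Class ~ ., data = dat, method = "rpart2", trControl = ctrl,
                    metric = "ROC",
                    tuneGrid = data.frame(maxdepth = fit_grid$bestTune$maxdepth))

# Did both runs train on exactly the same rows in every resample?
same_folds <- identical(fit_grid$control$index, fit_single$control$index)
print(same_folds)

# Per-resample ROC of the selected model, side by side
per_fold <- merge(fit_grid$resample[, c("Resample", "ROC")],
                  fit_single$resample[, c("Resample", "ROC")],
                  by = "Resample", suffixes = c("_grid", "_single"))
print(summary(per_fold$ROC_grid - per_fold$ROC_single))
```

If the folds turn out to be identical while the per-resample ROC values still differ, the seed for the fold split would presumably not be the cause, and the difference would have to come from somewhere else in how the two runs are fitted.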
Any help would be greatly appreciated.