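How should I interpret this: in the cross-validation resampling results that caret's train reports for my trained model, sensitivity is very low while AUC is very high.
Is the model performing badly?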
Answer 0 (score: 0)
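This usually happens when there is a class imbalance: the default 50% probability cutoff produces poor class predictions, but the class probabilities, although poorly calibrated, still do well at separating the classes. The ROC value summarizes how well the probabilities rank the samples across all possible cutoffs, whereas Sens and Spec are computed from the hard class predictions made at the 50% cutoff, so the two can disagree sharply.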
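To make the distinction concrete, the following is a minimal base R sketch on simulated data, not the caret model from this answer. The class sizes, the shift of 4 on the logit scale, and the helper name `auc_rank` are all assumptions chosen for illustration. It builds probabilities that rank a rare event class well but mostly fall below 0.5, then compares the rank-based AUC with sensitivity and specificity at the 0.5 cutoff.

```r
# Simulated illustration: a rare event class whose predicted probabilities
# rank well but mostly sit below 0.5 (all numbers here are assumptions).
set.seed(42)
y <- rep(c(1, 0), times = c(200, 1800))      # 1 = event (minority), 0 = non-event
latent <- rnorm(length(y), mean = 2.5 * y)   # events score higher on average
p <- plogis(latent - 4)                      # the shift pushes probabilities toward 0

# AUC from the rank-sum (Mann-Whitney) statistic: it depends only on the
# ordering of the probabilities, not on where they sit relative to 0.5.
auc_rank <- function(p, y) {
  r  <- rank(p)
  n1 <- sum(y == 1)
  n0 <- sum(y == 0)
  (sum(r[y == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

auc     <- auc_rank(p, y)
sens_50 <- mean(p[y == 1] >= 0.5)   # share of events predicted as events
spec_50 <- mean(p[y == 0] <  0.5)   # share of non-events predicted as non-events

print(round(c(AUC = auc, Sens = sens_50, Spec = spec_50), 3))
```

In this setup the AUC should come out high while sensitivity at 0.5 stays low, which is the same pattern as in the question: the ranking is good, but the fixed cutoff is in the wrong place for the minority class.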
Here is an example using caret:
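library(caret)

# Simulate a two-class data set; the large intercept makes the classes imbalanced
set.seed(1)
dat <- twoClassSim(500, intercept = 10)

# Tune a radial-basis SVM by random search, selecting on the area under the ROC curve
set.seed(2)
mod <- train(Class ~ ., data = dat, method = "svmRadial",
             tuneLength = 10,
             preProc = c("center", "scale"),
             metric = "ROC",
             trControl = trainControl(search = "random",
                                      classProbs = TRUE,
                                      summaryFunction = twoClassSummary))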
Results
> mod
Support Vector Machines with Radial Basis Function Kernel
500 samples
15 predictor
2 classes: 'Class1', 'Class2'
Pre-processing: centered (15), scaled (15)
Resampling: Bootstrapped (25 reps)
Summary of sample sizes: 500, 500, 500, 500, 500, 500, ...
Resampling results across tuning parameters:
  sigma       C             ROC        Sens        Spec
  0.01124608   21.27349102  0.9615725  0.33389177  0.9910125
  0.01330079  419.19384543  0.9579240  0.34620779  0.9914320
  0.01942163   85.16782989  0.9535367  0.33211255  0.9920583
  0.02168484  632.31603140  0.9516538  0.33065224  0.9911863
  0.02395674   89.03035078  0.9497636  0.32504906  0.9909382
  0.03988581    3.58620979  0.9392330  0.25279365  0.9920611
  0.04204420  699.55658836  0.9356568  0.23920635  0.9931667
  0.05263619    0.06127242  0.9265497  0.28134921  0.9839818
  0.05364313   34.57839446  0.9264506  0.19560317  0.9934489
  0.08838604   47.84104078  0.9029791  0.06296825  0.9955034
ROC was used to select the optimal model using the largest value.
The final values used for the model were sigma = 0.01124608 and C = 21.27349.
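One possible way to act on output like this is to choose a probability cutoff other than 0.5. The sketch below is a base R illustration on simulated probabilities, not on the fitted `mod`. The simulated data, the grid of candidate cutoffs, and the use of Youden's J (sensitivity + specificity - 1) as the selection criterion are assumptions; other criteria may suit a given problem better.

```r
# Simulated event probabilities (assumed data, same idea as an imbalanced problem):
# good ranking, but most probabilities are far below 0.5.
set.seed(42)
y <- rep(c(1, 0), times = c(200, 1800))             # 1 = event (minority)
p <- plogis(rnorm(length(y), mean = 2.5 * y) - 4)   # predicted P(event)

# Sensitivity and specificity over a grid of candidate cutoffs
cutoffs <- seq(0.01, 0.99, by = 0.01)
stats <- t(sapply(cutoffs, function(k) {
  c(cutoff = k,
    sens   = mean(p[y == 1] >= k),   # events called events at cutoff k
    spec   = mean(p[y == 0] <  k))   # non-events called non-events at cutoff k
}))

# Pick the cutoff that maximizes Youden's J = sens + spec - 1
youden <- stats[, "sens"] + stats[, "spec"] - 1
best   <- stats[which.max(youden), ]

print(round(best, 3))
```

With a caret model, `predict(mod, newdata, type = "prob")` returns the class probabilities, which could be thresholded in the same way. The cutoff is best chosen on held-out or resampled predictions rather than on the training set, otherwise the resulting sensitivity and specificity will tend to look optimistic.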