采样代码
library(caret)
trainset <- data.frame(
class=factor(c("Good", "Bad", "Good", "Good", "Bad", "Good", "Good", "Good", "Good", "Bad", "Bad", "Bad")),
age=c(67, 22, 49, 45, 53, 35, 53, 35, 61, 28, 25, 24))
testset <- data.frame(
class=factor(c("Good", "Bad", "Good" )),
age=c(64, 23, 50))
library(kernlab)
set.seed(231)
### finding optimal value of a tuning parameter
sigDist <- sigest(class ~ ., data = trainset, frac = 1)
### creating a grid of two tuning parameters, .sigma comes from the earlier line. we are trying to find best value of .C
svmTuneGrid <- data.frame(.sigma = sigDist[1], .C = 2^(-2:7))
set.seed(1056)
svmFit <- train(class ~ .,
data = trainset,
method = "svmRadial",
preProc = c("center", "scale"),
tuneGrid = svmTuneGrid,
trControl = trainControl(method = "repeatedcv", repeats = 5,
classProbs = TRUE))
predictedClasses <- predict(svmFit, testset )
predictedProbs <- predict(svmFit, newdata = testset , type = "prob")
使用公式界面,此代码可以完美运行。但是,如果我将其翻转为使用矩阵形式,则在预测并返回错误(NA)时不会计算类概率。见下文。
set.seed(1056)
svmFit <- train(x = trainset["age"], y = trainset$class,
method = "svmRadial",
preProc = c("center", "scale"),
tuneGrid = svmTuneGrid,
trControl = trainControl(method = "repeatedcv", repeats = 5, classProbs = TRUE))
predictedProbs <- predict(svmFit, newdata = testset , type = "prob")
仅尝试弄清楚为什么它不使用非公式接口对预测的数据集计算概率。引发此警告:
Warning message:
In method$prob(modelFit = modelFit, newdata = newdata, submodels = param) :
kernlab class probability calculations failed; returning NAs