Poor performance after SVM parameter tuning (R)

Time: 2018-07-09 15:35:45

Tags: r svm cross-validation

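I have the following code, in which I first fit an SVM with the default parameters and then tune the parameters using 10-fold cross-validation: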

library(readr)
library("e1071")

# Load the WDBC data: skip the ID column (X1), read the diagnosis (X2) as a factor
wdbc <- read_csv("https://archive.ics.uci.edu/ml/machine-learning-databases/breast-cancer-wisconsin/wdbc.data",
                 col_names = FALSE,
                 col_types = cols(X1 = col_skip(),
                                  X2 = col_factor(levels = c("M", "B"))))

# 75/25 train/test split
smp_size <- floor(0.75 * nrow(wdbc))

set.seed(2)
train_ind <- sample(seq_len(nrow(wdbc)), size = smp_size)

train <- wdbc[train_ind, ]
test <- wdbc[-train_ind, ]

# SVM with the default cost and gamma
model <- svm(X2 ~ ., data = train, kernel = "radial", probability = TRUE)
predicted <- predict(model, test[, -1], probability = TRUE)

CM <- table(test$X2, predicted)
print(CM)

# Grid search over cost and gamma (tune() uses 10-fold CV by default)
svm_tune <- tune(svm, train[, -1], train.y = train$X2,
                 kernel = "radial",
                 ranges = list(cost = 10^(-1:2), gamma = c(.5, 1, 2)))
summary(svm_tune)

# Refit with the tuned parameters
model_after_tune <- svm(X2 ~ ., data = train, kernel = "radial", probability = TRUE,
                        gamma = 0.5, cost = 10)
predicted <- predict(model_after_tune, test[, -1], probability = TRUE)

#attr(predicted, "probabilities")
CM <- table(test$X2, predicted)
print(CM)

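When I print the confusion matrix for the default SVM's predictions, I get 3 misclassified points. After tuning the parameters, I get 15. I tried other seeds to see whether I could get better results, but the default parameters always seem to work better. Any clue why this happens? Thanks.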
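One thing that may be worth checking, offered as a sketch rather than a confirmed diagnosis: when gamma is not supplied, e1071's svm() uses 1 / (number of predictors), which is about 0.033 for the 30 features here, whereas the grid above only tries gamma values of 0.5 and larger. The tuned model can therefore only pick from values well above the default. The helper below, tune_rbf_around_default, is a hypothetical name and not part of e1071. It assumes every column other than the response is a numeric predictor, and it builds the gamma grid around the default value. It is demonstrated on the built-in iris data so that it runs without a download.

```r
library(e1071)

# Hypothetical helper (not from the original post): tune an RBF SVM over a
# grid centred on e1071's default gamma, 1 / (number of predictors).
# Assumes every column of `data` other than `response` is a numeric predictor.
tune_rbf_around_default <- function(data, response,
                                    costs = 10^(-1:2),
                                    gamma_factors = 10^(-2:1)) {
  data <- as.data.frame(data)            # plain data frame for the formula interface
  default_gamma <- 1 / (ncol(data) - 1)  # value svm() uses when gamma is not given
  tune(svm, as.formula(paste(response, "~ .")), data = data,
       kernel = "radial",
       ranges = list(cost = costs, gamma = default_gamma * gamma_factors),
       tunecontrol = tune.control(cross = 10))  # 10-fold cross-validation
}

# Small demonstration on a dataset that ships with R.
set.seed(2)
demo_tune <- tune_rbf_around_default(iris, "Species")
print(demo_tune$best.parameters)
```

With the data from the question, the call would be tune_rbf_around_default(train, "X2"). The refit model is then available as the best.model element of the result and can be passed to predict() together with test[, -1], which avoids copying cost and gamma by hand from the summary.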

0 answers:

No answers