R: e1071 svm built-in cross-validation (cross =) gives results different from manual LOOCV

Asked: 2016-08-04 13:48:02

Tags: r svm

I am using the e1071 svm function to classify my data, and I have tried LOOCV in two different ways. The first looks like this:

# LOOCV via e1071's built-in cross-validation (cross = number of subjects)
svm.model <- svm(mem ~ ., data, kernel = "sigmoid", cost = 7,
                 gamma = 0.009, cross = subSize)
# Start from the true labels, then flip the label wherever the per-fold
# accuracy is 0, i.e. wherever the held-out point was misclassified
svm.pred <- data$mem
svm.pred[which(svm.model$accuracies == 0 & svm.pred == 'good')] <- NA
svm.pred[which(svm.model$accuracies == 0 & svm.pred == 'bad')] <- 'good'
svm.pred[is.na(svm.pred)] <- 'bad'
conMAT <- table(pred = svm.pred, true = data$mem)
summary(svm.model)

I passed cross = 'number of subjects' to get LOOCV, but the classification results differ from my manual version of LOOCV, which looks like this:

# Manual LOOCV: hold out one row at a time
CMAT <- matrix(0, nrow = 2, ncol = 2)  # running confusion matrix
CORR <- numeric(subSize)               # 1 if fold i was classified correctly
for (i in 1:subSize){
  data_Tst <- data[i, 1:dSize]
  data_Trn <- data[-i, 1:dSize]
  svm.model1 <- svm(mem ~ ., data = data_Trn, kernel = "linear",
                    cost = 2, gamma = 0.02)
  svm.pred1 <- predict(svm.model1, data_Tst[, -dSize])
  conMAT <- table(pred = svm.pred1, true = data_Tst[, dSize])
  CMAT <- CMAT + conMAT
  CORR[i] <- sum(diag(conMAT))
}

In my view, with LOOCV the accuracy should not change across repeated runs of the code, because the SVM builds each model from all the data except one point, and does this until the loop ends. However, when I use the svm function with the 'cross' argument, the accuracy is different every time I run the code.
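The determinism claim for the manual loop can be checked directly: run the same hold-one-out loop twice and compare the predictions. The sketch below is a minimal stand-in for the asker's setup, using the built-in iris data restricted to two classes in place of data/mem (those names and the two-class structure are assumptions, not the asker's actual data):

```r
library(e1071)

## Toy stand-in for 'data': two iris classes, binary response
d <- droplevels(iris[iris$Species %in% c("setosa", "versicolor"), ])

## One pass of manual LOOCV; returns the vector of held-out predictions
loocv <- function(d) {
  sapply(seq_len(nrow(d)), function(i) {
    m <- svm(Species ~ ., data = d[-i, ], kernel = "linear", cost = 2)
    as.character(predict(m, d[i, , drop = FALSE]))
  })
}

p1 <- loocv(d)
p2 <- loocv(d)
identical(p1, p2)                     # manual LOOCV is deterministic
mean(p1 == as.character(d$Species))   # LOOCV accuracy
```

Since libsvm's training is deterministic for a fixed training set, the two passes produce identical predictions; any run-to-run variation must come from how the folds are formed, not from the SVM fit itself.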

Which way is more accurate? Thanks for reading this post! :-)

1 Answer:

Answer 0 (score: 0)

You are using different hyper-parameters (cost = 7, gamma = 0.009 vs. cost = 2, gamma = 0.02) and different kernels (sigmoid vs. linear) in the two versions. If you want identical results, these should be the same in each run.

Also, it depends how Leave One Out (LOO) is implemented:

  1. Does your LOO method leave one out randomly or as a sliding window over the dataset?

  2. Does your LOO method leave one out from one class at a time or both classes at the same time?

  3. Is the training set always the same, or are you using a randomisation procedure before splitting into training and testing sets (assuming you have a separate independent testing set)? In that case, the examples you are cross-validating would change on each run.
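Putting the answer's first point into practice: once the kernel, cost and gamma are matched, the built-in cross = n and a manual hold-one-out loop evaluate the same n train/test splits, so their accuracies should agree. A minimal sketch of the comparison, again using a two-class iris subset as a stand-in for the asker's data (the variable names acc_builtin and acc_manual are illustrative):

```r
library(e1071)

## Two-class stand-in for the asker's data / 'mem' response
d <- droplevels(iris[iris$Species %in% c("setosa", "versicolor"), ])
n <- nrow(d)

## Built-in LOOCV: cross = number of rows; tot.accuracy is in percent
m <- svm(Species ~ ., data = d, kernel = "linear", cost = 2,
         gamma = 0.02, cross = n)
acc_builtin <- m$tot.accuracy / 100

## Manual LOOCV with the *same* kernel, cost and gamma
hit <- logical(n)
for (i in seq_len(n)) {
  fit <- svm(Species ~ ., data = d[-i, ], kernel = "linear",
             cost = 2, gamma = 0.02)
  hit[i] <- predict(fit, d[i, , drop = FALSE]) == d$Species[i]
}
acc_manual <- mean(hit)

c(builtin = acc_builtin, manual = acc_manual)
```

With mismatched kernels or hyper-parameters, as in the question, the two numbers measure different classifiers and there is no reason for them to coincide.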
