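I am running 10-fold cross-validation on three different algorithms: SVM, a neural network, and rpart. I am trying to pick the most accurate one.
The rpart errors are: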
[1] 0.9705882353 1.0000000000 1.0000000000 0.9411764706 1.0000000000 1.0000000000 0.8823529412 1.0000000000
[9] 0.9705882353 0.9705882353
The neural network errors are:
[1] 157.52352368 80.07471671 95.11278873 78.70281592 100.58281184 79.61699438 90.91953877 120.54595936
[9] 143.57563143 41.78655472
The svm errors are:
[1] 60.89085317 86.23601068 115.46072775 75.03890373 70.18948759 102.37164174 117.48471252 61.60451089
[9] 89.44647999
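Here is the code I used. For the neural network:

    library(neuralnet)

    # k, form, leaf, scaled, cv.error and pbar are defined earlier in my script
    for (i in 1:k) {
      # index = sample(seq_len(nrow(leaf)), size = samplesize)
      # Random 90/10 split, redrawn on every iteration
      index <- sample(1:nrow(leaf), round(0.9 * nrow(leaf)))
      # index = total_index[(i*(k-1)+1):(i*(k-1)+k)]
      train.cv <- scaled[index, ]
      test.cv <- scaled[-index, ]
      nn <- neuralnet(form, train.cv, hidden = c(5, 2), linear.output = TRUE, stepmax = 1e6)
      pr.nn <- compute(nn, test.cv[, 2:16])
      # Undo the min-max scaling so predictions and targets are in the original Class units
      pr.nn <- pr.nn$net.result * (max(leaf$Class) - min(leaf$Class)) + min(leaf$Class)
      test.cv.r <- (test.cv$Class) * (max(leaf$Class) - min(leaf$Class)) + min(leaf$Class)
      # Mean squared error on the held-out rows
      cv.error[i] <- sum((test.cv.r - pr.nn)^2) / nrow(test.cv)
      pbar$step()
    }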
For rpart:

    library(rpart)
    library(plyr)  # for ldply

    set.seed(123)
    form <- "Class ~ SpecimenNumber+Eccentricity+AspRatio+Elongation+Solidity+StoConvex+IsoFactor+MaxIndentDepth+Lobeedness+AvgIntensity+AvgContrast+Smoothness+ThirdMoment+Uniformity+Entropy"
    # Shuffle the row positions and cut them into 10 folds
    folds <- split(scaled, cut(sample(1:nrow(scaled)), 10))
    errs <- rep(NA, length(folds))
    for (i in 1:length(folds)) {
      test <- ldply(folds[i], data.frame)
      train <- ldply(folds[-i], data.frame)
      # train <- scaled[index,]
      # test <- scaled[-index,]
      tmp.model <- rpart(form, train, method = "class")
      tmp.predict <- predict(tmp.model, newdata = test, type = "class")
      conf.mat <- table(test$Class, tmp.predict)
      # Misclassification rate: 1 minus the share of counts on the diagonal
      errs[i] <- 1 - (sum(diag(conf.mat)) / sum(conf.mat))
    }
For svm:

    library(e1071)

    tuned <- tune.svm(Class ~ ., data = traindata, gamma = 10^(-1:-3), cost = 10^(1:3),
                      tunecontrol = tune.control(cross = 10))
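leaf is the actual data, and scaled is the leaf data after min-max scaling.
I need to compare these errors and choose the most accurate algorithm, but I don't know how to do that when the algorithms report errors in different units. rpart gives me errors between 0 and 1, while the other two give me errors in the tens or hundreds. I can't figure out what makes the algorithms produce different units. How can I set up the cross-validation so that it gives me the same metric across the board?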
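For reference, what I think I am after looks roughly like the sketch below. In my code above, the neural network loop computes a mean squared error on the numeric Class value, while the rpart loop computes a misclassification rate, so the two cannot share a scale. As far as I understand, tune.svm also reports a squared error when the response is numeric rather than a factor. The sketch assumes the target is a factor (a categorical label), uses the built-in iris data as a stand-in for leaf, and uses a helper name, cv_misclass, that I made up. It reuses one fold assignment for every algorithm and always reports the share of misclassified test rows.

```r
library(rpart)  # ships with standard R installations

# Hypothetical helper: misclassification rate per fold for any classifier.
# fit_fun(train) returns a fitted model; pred_fun(model, test) returns class labels.
cv_misclass <- function(data, target, fold_id, fit_fun, pred_fun) {
  sapply(sort(unique(fold_id)), function(f) {
    train <- data[fold_id != f, , drop = FALSE]
    test  <- data[fold_id == f, , drop = FALSE]
    model <- fit_fun(train)
    pred  <- pred_fun(model, test)
    # Share of held-out rows whose predicted label differs from the true label
    mean(as.character(pred) != as.character(test[[target]]))
  })
}

set.seed(123)
dat <- iris  # stand-in data set, NOT the leaf data
k <- 10
# One shuffled fold label per row, reused for every algorithm
fold_id <- sample(rep(1:k, length.out = nrow(dat)))

rpart_err <- cv_misclass(
  dat, "Species", fold_id,
  fit_fun  = function(train) rpart(Species ~ ., data = train, method = "class"),
  pred_fun = function(m, test) predict(m, newdata = test, type = "class")
)
print(mean(rpart_err))

# svm() does classification only when the response is a factor
if (requireNamespace("e1071", quietly = TRUE)) {
  svm_err <- cv_misclass(
    dat, "Species", fold_id,
    fit_fun  = function(train) e1071::svm(Species ~ ., data = train),
    pred_fun = function(m, test) predict(m, newdata = test)
  )
  print(mean(svm_err))
}
```

A neural network could presumably be plugged in the same way, as long as its pred_fun turns the network output into a class label before the comparison. Every entry of rpart_err and svm_err is then a proportion between 0 and 1 computed on the same held-out rows.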