“Format of predictions is invalid”

Time: 2016-11-04 18:27:02

Tags: r compiler-errors prediction fold knn

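I am implementing a ROC curve with ROCR for kNN with 10-fold cross-validation. I am using the Ionosphere dataset.

Here is the attribute information for your reference:

- All 34 are continuous, as described above
- The 35th attribute is either "good" or "bad", according to the definition summarized above. This is a binary classification task.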

data1<-read.csv('https://archive.ics.uci.edu/ml/machine-learning-databases/ionosphere/ionosphere.data',header = FALSE)

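I built my own version, and kNN with k-fold cross-validation works. But when I add the ROCR code, it fails with the error "Format of predictions is invalid". I checked pred and Class1, and they have the same dimensions. I also tried data.test$V35 instead of Class1 and got the same error.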

install.packages("class")
library(class)

nrFolds <- 10
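# Recode the class label ("b"/"g") as 1/2. factor() is needed on R >= 4.0,
# where read.csv() returns character columns instead of factors.
data1[,35]<-as.numeric(factor(data1[,35]))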

# generate array containing fold-number for each sample (row)
folds <- rep_len(1:nrFolds, nrow(data1))


# actual cross validation
for(k in 1:nrFolds) {
  # actual split of the data
  fold <- which(folds == k)
  data.train <- data1[-fold,]
  data.test <- data1[fold,]

  Class<-data.train[,35]
  Class1<-data.test[,35]
  # train and test your model with data.train and data.test

  pred<-knn(data.train, data.test, Class, k = 5, l = 0, prob = FALSE, use.all = TRUE)
  data<-data.frame('predict'=pred, 'actual'=Class1)
  count<-nrow(data[data$predict==data$actual,])
  total<-nrow(data.test)
  avg = (count*100)/total
  avg =format(round(avg, 2), nsmall = 2)
  method<-"KNN" 
  accuracy<-avg
  cat("Method = ", method,", accuracy= ", accuracy,"\n")
}

install.packages("ROCR")
library(ROCR)
rocrPred=prediction(pred, Class1, NULL)
rocrPerf=performance(rocrPred, 'tpr', 'fpr')
plot(rocrPerf, colorize=TRUE, text.adj=c(-.2,1.7))
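
A likely cause of the error can be checked on toy data: `knn()` returns a factor, and a factor is not a plain vector, whereas `prediction()` expects numeric predictions. The sketch below uses made-up matrices (`train_x`, `test_x`, `train_y` are illustrative names, not part of the Ionosphere code) and only inspects the types. Converting the factor to numeric is one way to get an input that `prediction()` accepts.

```r
library(class)  # recommended package that ships with R

# Toy data (hypothetical, not the Ionosphere set): two numeric features
set.seed(1)
train_x <- matrix(rnorm(40), ncol = 2)
test_x  <- matrix(rnorm(20), ncol = 2)
train_y <- rep(c(1, 2), each = 10)

pred <- knn(train_x, test_x, cl = train_y, k = 5)

is.factor(pred)   # TRUE: knn() returns a factor of predicted classes
is.vector(pred)   # FALSE: a factor carries attributes, so it is not a plain vector

# Go through character so the original class codes (1/2) are recovered
pred_num <- as.numeric(as.character(pred))
is.vector(pred_num)  # TRUE: a plain numeric vector
```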

Any help is appreciated.

1 Answer:

Answer 0 (score: 0)

This works for me:

install.packages("class")
library(class)
library(ROCR)
nrFolds <- 10
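# Recode the class label ("b"/"g") as 1/2. factor() is needed on R >= 4.0,
# where read.csv() returns character columns instead of factors.
data1[,35]<-as.numeric(factor(data1[,35]))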

# generate array containing fold-number for each sample (row)
folds <- rep_len(1:nrFolds, nrow(data1))


# actual cross validation
for(k in 1:nrFolds) {
  # actual split of the data
  fold <- which(folds == k)
  data.train <- data1[-fold,]
  data.test <- data1[fold,]

  Class<-data.train[,35]
  Class1<-data.test[,35]
  # train and test your model with data.train and data.test

  pred<-knn(data.train, data.test, Class, k = 5, l = 0, prob = FALSE, use.all = TRUE)
  data<-data.frame('predict'=pred, 'actual'=Class1)
  count<-nrow(data[data$predict==data$actual,])
  total<-nrow(data.test)
  avg = (count*100)/total
  avg =format(round(avg, 2), nsmall = 2)
  method<-"KNN" 
  accuracy<-avg
  cat("Method = ", method,", accuracy= ", accuracy,"\n")

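  # prediction() takes the numeric predictions first and the true labels second;
  # knn() returns a factor, so convert it to numeric before passing it in
  rocrPred <- prediction(as.numeric(as.character(pred)), Class1)
  perf <- performance(rocrPred, "tpr", "fpr")
  # the first fold opens the plot, later folds are drawn on top of it
  plot(perf, colorize = TRUE, add = (k > 1))
  abline(0, 1)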
}
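
Hard class predictions give each fold a ROC curve with only one interior point. A possible refinement, sketched here under stated assumptions, is to ask `knn()` for vote shares with `prob = TRUE`, turn them into a score for one class, and pool the held-out scores from all folds into a single curve. The `toy` data frame is a made-up stand-in for `data1` (34 numeric features, label in column 35 coded 1/2). Treating class 2 as the positive class and pooling across folds are illustrative choices, not something the question specifies. The sketch also drops column 35 from the features, since the code above passes the label column to `knn()` along with the predictors.

```r
library(class)

# Hypothetical stand-in for data1: 34 numeric features, column 35 = label (1 or 2)
set.seed(42)
n <- 200
label <- rep(c(1, 2), length.out = n)
toy <- data.frame(matrix(rnorm(n * 34), ncol = 34) + (label - 1))
toy[, 35] <- label

nrFolds <- 10
folds <- rep_len(1:nrFolds, nrow(toy))
scores <- numeric(nrow(toy))

for (k in 1:nrFolds) {
  fold <- which(folds == k)
  # Exclude column 35 from the features so the label does not leak into the distances
  pred <- knn(toy[-fold, -35], toy[fold, -35], cl = toy[-fold, 35],
              k = 5, prob = TRUE)
  # attr(pred, "prob") is the vote share of the winning class;
  # convert it into a score for class 2
  win <- attr(pred, "prob")
  scores[fold] <- ifelse(pred == "2", win, 1 - win)
}

# One pooled ROC curve over all held-out scores (skipped if ROCR is not installed)
if (suppressWarnings(require("ROCR", quietly = TRUE))) {
  rocrPred <- prediction(scores, toy[, 35])
  rocrPerf <- performance(rocrPred, "tpr", "fpr")
  plot(rocrPerf, colorize = TRUE)
  abline(0, 1, lty = 2)
}
```

With the real data, `toy` would be replaced by `data1` after the label column has been recoded to 1/2.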