Question

I am running a knn model in R and trying to find the best k. To do that, I wrote the following code.

suppressMessages(library(class))
set.seed(1)
Lag1 = rnorm(30)
Direction = c(0, 0, 1, 0, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0, 0, 1)
X_train <- data.frame(Lag1[1:20])
X_test <- data.frame(Lag1[21:30])
Y_train <- data.frame(Direction[1:20])
Y_test <- data.frame(Direction[21:30])
knn_res <- rep(1,10)
for (i in 1:10) {
  predk <- knn(X_train, X_test, Y_train[,1], k=i)
  cm <- as.matrix(table(predk, Y_test[,1]))
  knn_res[i] <- sum(diag(cm))/length(predk)
}

# which k gives the highest accuracy
which.max(knn_res)

# looks like k = 1 gives the highest accuracy
predk <- knn(X_train, X_test, Y_train[,1], k=1)
cm <- as.matrix(table(predk1, Y_test[,1]))
sum(diag(cm))/length(predk)

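According to which.max(knn_res), my best k should be 1, but when I run the exact code from the loop again to print my confusion matrix, the accuracy it returns does not match the one stored in knn_res. knn_res[1] returns 0.5, while sum(diag(cm))/length(predk) returns 0.3.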

Where did I go wrong? I suspect it is the way I am adding values to knn_res, but I am not sure what exactly...
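One thing that may be worth checking: in the re-run above, the predictions are assigned to predk, but the table is built from predk1. If an object named predk1 is still in the workspace from an earlier experiment, the confusion matrix would describe those stale predictions, which could explain 0.3 versus 0.5. This is a guess, not a confirmed diagnosis. The sketch below avoids the problem by keeping the predictions for every k in a list, so the confusion matrix for the best k is built from the very same predictions that produced knn_res. The names preds, y_train, y_test, best_k and acc_best are introduced here for illustration and are not from the original code.

```r
library(class)  # provides knn()

set.seed(1)
Lag1 <- rnorm(30)
# Same 30 labels as in the question (the 10-value pattern repeated 3 times)
Direction <- rep(c(0, 0, 1, 0, 1, 1, 0, 0, 0, 1), 3)

X_train <- data.frame(Lag1 = Lag1[1:20])
X_test  <- data.frame(Lag1 = Lag1[21:30])
y_train <- factor(Direction[1:20])
# Force the same levels so the confusion matrix is always square
y_test  <- factor(Direction[21:30], levels = levels(y_train))

# Keep the predictions for every k instead of overwriting one variable
preds <- lapply(1:10, function(k) knn(X_train, X_test, y_train, k = k))

# Accuracy per k, computed directly as the share of correct predictions
knn_res <- vapply(preds, function(p) mean(p == y_test), numeric(1))
best_k <- which.max(knn_res)

# Confusion matrix built from the stored predictions for best_k
cm <- table(predicted = preds[[best_k]], actual = y_test)
acc_best <- sum(diag(cm)) / sum(cm)

print(knn_res)
print(best_k)
print(cm)
print(acc_best)  # equals knn_res[best_k] by construction
```

Reusing the stored predictions also sidesteps a second possible source of drift: knn() breaks voting ties at random, so re-running it for an even k can give slightly different predictions unless the seed is reset first.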

Accuracy in knn loop results list differs from actual knn accuracy

0 answers: