Question

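I'm working in R with an imbalanced dataset, and I need to know how to get the k nearest neighbours of the points in the dataset, because I need them to create new synthetic examples.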

    set.seed(123)
    test <- 1:100
    train.gc <- gc.subset[-test,]
    test.gc <- gc.subset[test,]

    train.def <- gc$Default[-test]
    test.def <- gc$Default[test]

    library(class)
    knn.5 <-  knn(train.gc, test.gc, train.def, k=5)
    # How can I get the five nearest neighbours of each test point?

Answer 1

Although it doesn't seem to be documented, the help for knn hints that attributes of the result might store things:

    # Assumption: the nn.index / nn.dist attributes shown below come from
    # FNN::knn; class::knn only attaches "prob".
    library(FNN)
    train <- rbind(iris3[1:25,,1], iris3[1:25,,2], iris3[1:25,,3])
    test <- rbind(iris3[26:50,,1], iris3[26:50,,2], iris3[26:50,,3])
    cl <- factor(c(rep("s",25), rep("c",25), rep("v",25)))
    k <- knn(train, test, cl, k = 3, prob = TRUE)
    names(attributes(k))
    # [1] "levels"   "class"    "prob"     "nn.index" "nn.dist"

I'm guessing that nn.index holds the indices of the neighbours:

    > head(attr(k,"nn.index"))
         [,1] [,2] [,3]
    [1,]   10    2   13
    [2,]   24    8   18
    [3,]    1   18    8
    [4,]    1   18    8

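I'm guessing that each row gives the 3 nearest neighbours of one of the first four test points, as row indices into train.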
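Since the goal in the question is to build synthetic examples, here is a minimal base-R sketch of that step. It does not rely on any attribute of the knn result. The helper names k_nearest and synthesize are hypothetical, the toy matrix stands in for the rows of the minority class, and it assumes all features are numeric and that Euclidean distance is appropriate. The interpolation is in the spirit of SMOTE, not a full implementation of it.

    # Toy stand-in for the minority-class rows (numeric features only).
    set.seed(123)
    minority <- matrix(rnorm(20 * 2, mean = 2), ncol = 2)

    # Indices of the k nearest neighbours of every row of x, searched among
    # the other rows of x. Returns an nrow(x) by k integer matrix.
    k_nearest <- function(x, k) {
      d <- as.matrix(dist(x))   # pairwise Euclidean distances
      diag(d) <- Inf            # a point is not its own neighbour
      nn <- vapply(seq_len(nrow(d)),
                   function(i) order(d[i, ])[seq_len(k)],
                   integer(k))
      matrix(nn, ncol = k, byrow = TRUE)
    }

    # Create n_new synthetic rows: pick a random base point, pick one of its
    # k nearest neighbours, and move a random fraction of the way towards it.
    synthesize <- function(x, k = 5, n_new = 10) {
      nn <- k_nearest(x, k)
      base <- sample(nrow(x), n_new, replace = TRUE)
      picked <- nn[cbind(base, sample(k, n_new, replace = TRUE))]
      gap <- runif(n_new)       # one interpolation fraction per new row
      x0 <- x[base, , drop = FALSE]
      x0 + gap * (x[picked, , drop = FALSE] - x0)
    }

    neighbours <- k_nearest(minority, 5)
    new_points <- synthesize(minority, k = 5, n_new = 10)

Here neighbours[i, ] lists the five minority rows closest to row i, and new_points has one synthetic example per row. For larger data, a dedicated nearest-neighbour search would be preferable to the full distance matrix used in this sketch.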
