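I am working in R with an imbalanced dataset, and I need to know how to get the k nearest neighbours of the points in the dataset, because I need them to create new synthetic examples.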
set.seed(123)
test <- 1:100
train.gc <- gc.subset[-test,]
test.gc <- gc.subset[test,]
train.def <- gc$Default[-test]
test.def <- gc$Default[test]
library(class)
knn.5 <- knn(train.gc, test.gc, train.def, k=5)
# How can I get the five nearest neighbours themselves?
Answer 0 (score: 0)
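Although it does not seem to be documented, the help for knn hints that the attributes of the result may store something useful. Note that the nn.index and nn.dist attributes shown below are the ones attached by knn from the FNN package; as far as I know, class::knn only attaches prob, so load FNN if they are missing: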
train <- rbind(iris3[1:25,,1], iris3[1:25,,2], iris3[1:25,,3])
test <- rbind(iris3[26:50,,1], iris3[26:50,,2], iris3[26:50,,3])
cl <- factor(c(rep("s",25), rep("c",25), rep("v",25)))
k = knn(train, test, cl, k = 3, prob=TRUE)
names(attributes(k))
# [1] "levels" "class" "prob" "nn.index" "nn.dist"
My guess is that nn.index holds the indices of the neighbours, as row numbers in train:
> head(attr(k,"nn.index"))
[,1] [,2] [,3]
[1,] 10 2 13
[2,] 24 8 18
[3,] 1 18 8
[4,] 1 18 8
I guess these are the 3 nearest neighbours of the first four test points.
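One way to check that guess without relying on undocumented attributes is to compute the neighbours by hand. The following is only a minimal sketch in base R, assuming Euclidean distance and a brute-force search; knn_index is a hypothetical helper name, not a function from class or FNN. Ties may be broken differently from a package implementation, so the indices need not match exactly. If the FNN package is installed, FNN::get.knnx(train, test, k) should return comparable nn.index and nn.dist components.

```r
# Hypothetical helper (not part of class or FNN): brute-force k nearest
# neighbours under Euclidean distance, using base R only.
knn_index <- function(train, test, k) {
  train <- as.matrix(train)
  test  <- as.matrix(test)
  # For each test row, compute the distance to every training row and
  # keep the positions of the k smallest distances.
  idx <- apply(test, 1, function(x) {
    d <- sqrt(colSums((t(train) - x)^2))
    order(d)[seq_len(k)]
  })
  # apply() returns the neighbours column-wise; reshape so that each row
  # corresponds to one test point and each column to one neighbour.
  matrix(idx, ncol = k, byrow = TRUE)
}

# Same split of iris3 as in the answer above.
train <- rbind(iris3[1:25,,1], iris3[1:25,,2], iris3[1:25,,3])
test  <- rbind(iris3[26:50,,1], iris3[26:50,,2], iris3[26:50,,3])

nn <- knn_index(train, test, k = 3)
dim(nn)            # 75 rows (test points) by 3 columns (neighbours)
head(nn, 4)        # neighbour indices for the first four test points
train[nn[1, ], ]   # the actual neighbours of the first test point
```

For building synthetic examples from a single class, the same helper can be called with that class's rows as both train and test, using k + 1 and dropping the first column, since each point is then its own nearest neighbour (assuming no duplicate rows).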