Nearest neighbors in R's kknn package give garbage index values when using the whole dataset

Time: 2015-04-19 01:25:33

Tags: r machine-learning nearest-neighbor knn

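I am using the "kknn" package in R to find all the nearest neighbors of every row in a dataset. For some strange reason, the last row of the test dataset is always ignored. The R code and its output are below.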

X1 <- c(0.6439659, 0.1923593, 0.3905551, 0.7728847, 0.7602632)
X2 <- c(0.9147394, 0.6181713, 0.8515923, 0.8459367, 0.9296278)
Class <- c(1, 1, 0, 0, 0)
Data <- data.frame(X1,X2,Class)
Data$Class <- as.factor(Data$Class)
library("kknn")
### Here, the object Data serves as both the training and the testing data set
Neighbors.KNN <- kknn(Data$Class~., Data,Data,k = 5, distance =2, kernel = "gaussian")

## Output 
## The Column 5 in the below output is filled with garbage values and the value of the first value in the last row is 4, when it has to be 5.
Neighbors.KNN$C  
     [,1] [,2] [,3] [,4]    [,5]
[1,]    1    4    3    2 3245945
[2,]    2    3    4    1 3245945
[3,]    3    1    4    2 3245945
[4,]    4    1    3    2 3245945
[5,]    1    4    3    2 3245945

Could someone tell me whether I am doing something wrong or whether there is a bug in the package?

1 answer:

Answer 0 (score: 2)

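The current implementation (silently) assumes that k is smaller than n, the number of rows. Usually k << n, in which case this is not a problem. The (k+1)th neighbor is used to scale the distances, so with k = n there is no such neighbor to read. I should mention this in the documentation.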
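As a minimal sketch of a workaround based on the explanation above: cap k at n - 1 so that a (k+1)th neighbor always exists. The names k_safe, fit and full_rank are illustrative and not part of kknn. The formula is written as Class ~ . here (an assumption about what the question intended), because Data$Class ~ . would also pull the Class column in as a predictor. The base R lines at the end are only a cross-check on the raw coordinates; kknn standardizes the predictors by default, so the two orderings need not agree.

```r
library(kknn)

X1 <- c(0.6439659, 0.1923593, 0.3905551, 0.7728847, 0.7602632)
X2 <- c(0.9147394, 0.6181713, 0.8515923, 0.8459367, 0.9296278)
Class <- factor(c(1, 1, 0, 0, 0))
Data <- data.frame(X1, X2, Class)

# kknn needs a (k + 1)th neighbor to scale the distances,
# so keep k strictly below the number of training rows.
k_safe <- min(5, nrow(Data) - 1)

fit <- kknn(Class ~ ., train = Data, test = Data,
            k = k_safe, distance = 2, kernel = "gaussian")

# Neighbor indices: one row per test point, k_safe columns.
fit$C

# Base R cross-check (unscaled Euclidean distance): a full ranking of
# all n rows for each point, with the point itself in the first column.
d <- as.matrix(dist(Data[, c("X1", "X2")]))
full_rank <- t(apply(d, 1, order))
full_rank
```

Because the training and test sets are the same object, each point counts as its own nearest neighbor, so with k = n - 1 the farthest row is left out of fit$C. The base R ranking covers all n rows if the complete ordering is needed.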

Regards, Klaus