
I have a data set with the following structure:

Number of Rows = 1000

Each row corresponds to one article.

Number of Columns = 502

Column 1 : Title of the article
Column 2 : Text of the article
Columns 3 to 502 : Each column corresponds to a unique word appearing in the corpus of all the articles in the data set and gives the count of that word in the article

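Given the title of a single article from the data set above, the problem is to find the 10 articles closest to it in content using the k-nearest neighbors algorithm. I could implement the algorithm from scratch, but I would like to use a built-in function in R. The R documentation for knn (https://stat.ethz.ch/R-manual/R-devel/library/class/html/knn.html) appears to be written for supervised learning. Please explain how to use it in this situation, which is really a case of unsupervised learning.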
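To make the task concrete, here is a minimal base-R sketch of the neighbor search itself, without `class::knn` (which expects class labels). It is not a definitive solution. The data frame `articles`, the function `nearest_articles`, and the simulated word counts are all hypothetical stand-ins for the real data, and plain Euclidean distance on raw counts is an assumption; cosine distance or TF-IDF weighting may suit text better.

```r
# Hypothetical stand-in for the data set described above:
# 1000 articles, a title column, a text column, and 500 word-count columns.
set.seed(1)
n_articles <- 1000
n_words <- 500
counts <- matrix(rpois(n_articles * n_words, lambda = 1), nrow = n_articles)
articles <- data.frame(
  title = paste("Article", seq_len(n_articles)),
  text = "",
  counts,
  stringsAsFactors = FALSE
)

# Return the k articles whose word-count vectors are closest (Euclidean
# distance, an assumption) to the article with the given title.
nearest_articles <- function(df, query_title, k = 10) {
  x <- as.matrix(df[, 3:ncol(df)])       # word counts only (columns 3..502)
  i <- match(query_title, df$title)      # row index of the query article
  if (is.na(i)) stop("title not found: ", query_title)
  # Subtract the query row from every row, then take row-wise norms.
  d <- sqrt(rowSums(sweep(x, 2, x[i, ])^2))
  d[i] <- Inf                            # exclude the query article itself
  idx <- order(d)[seq_len(k)]            # indices of the k smallest distances
  data.frame(title = df$title[idx], distance = unname(d[idx]))
}

result <- nearest_articles(articles, "Article 1", k = 10)
print(result)
```

With only 1000 rows a brute-force search like this is cheap. For larger corpora, a dedicated nearest-neighbor package (for example `FNN`, which I believe offers an unsupervised `get.knn` function) might be worth checking.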

How to use R

0 answers: