Question

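I've been given an N-dimensional space containing millions of points. I'm looking for the most efficient way to build a model that lets me find, at runtime, the K (K < 100) points closest to a given point.

List FindClosestMatch(Point target, Model model)

I've started looking at R*-trees, but I'm wondering whether that is the right approach...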

Answer 1

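The R-tree variants are a decent choice, but an M-tree is somewhat better suited to your application, because you only need to compute a single distance to determine how close a bounding sphere is to the target point: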

https://en.wikipedia.org/wiki/M-tree
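
To illustrate the "single distance" point, here is a minimal sketch of the pruning test for a bounding sphere. It is not taken from any M-tree library; the function names `ball_lower_bound` and `can_prune` and the parameter `kth_best` (the distance to the current k-th nearest candidate) are made up for this example, and Euclidean distance is assumed.

```python
import math


def ball_lower_bound(query, center, radius):
    """Smallest possible distance from `query` to any point inside the ball."""
    # Only one distance computation (query to center) is needed. By the
    # triangle inequality, no point in the ball can be closer than this.
    return max(0.0, math.dist(query, center) - radius)


def can_prune(query, center, radius, kth_best):
    """True if nothing under this ball can beat the current k-th best distance."""
    return ball_lower_bound(query, center, radius) > kth_best
```

A kNN search would call something like `can_prune` on each node's ball and skip the subtree whenever it returns true. Because the bound uses only the metric itself, the same test works for any distance function that satisfies the triangle inequality.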

Answer 2

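You could partition the space into n-cubes such that the expected number of points in a cube is rk, for some value r > 1 to be determined. Then, for a given query p: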

1. Consider the 3^n n-cubes at most 1 away from the cube p is in.
2. Calculate the shortest distance d between p and one of these cubes.
3. Find the distance between p and each point in these cubes.
4. If the total number of points within d is >= k, return the k closest.
5. If not, expand your radius by one cube and repeat.
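
The steps above can be sketched roughly as follows. This is one possible reading, not the answerer's code: the class name `GridIndex`, the parameter `cell` (the side length of a cube) and the method `k_nearest` are made up for the example, Euclidean distance is assumed, and d is taken to be the distance from p to the nearest outer face of the block of cubes being searched. Enumerating the block costs (2m+1)^n lookups, so this is only practical for small n.

```python
import heapq
import itertools
import math
from collections import defaultdict


class GridIndex:
    """Uniform grid of n-cubes with side length `cell` (hypothetical helper)."""

    def __init__(self, points, cell):
        self.cell = cell
        self.size = len(points)
        self.cells = defaultdict(list)
        for p in points:
            self.cells[self._key(p)].append(tuple(p))

    def _key(self, p):
        # Integer coordinates of the cube that contains p.
        return tuple(math.floor(x / self.cell) for x in p)

    def k_nearest(self, q, k):
        home = self._key(q)
        m = 1  # search radius in cubes; m = 1 is the 3^n block of step 1
        while True:
            # Steps 1 and 3: distances to all points in cubes within m of q's cube.
            found = []
            for offset in itertools.product(range(-m, m + 1), repeat=len(q)):
                key = tuple(h + o for h, o in zip(home, offset))
                for p in self.cells.get(key, ()):
                    found.append((math.dist(q, p), p))
            # Step 2: distance from q to the nearest outer face of the block.
            # Every point outside the block is at least this far from q.
            d = min(
                min(x - (h - m) * self.cell, (h + m + 1) * self.cell - x)
                for x, h in zip(q, home)
            )
            inside = sum(1 for dist, _ in found if dist <= d)
            # Step 4: stop once k points lie within d, or every point was seen.
            if inside >= k or len(found) == self.size:
                return [p for _, p in heapq.nsmallest(k, found)]
            m += 1  # Step 5: widen the block by one cube and retry
```

For example, `GridIndex(points, 0.2).k_nearest(q, 5)` would return the five stored points closest to `q`, nearest first. If fewer than k points are stored, the loop ends once the block covers all of them and returns everything it found.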

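When choosing r, you need to trade off having a larger default set of points to search through against having to expand the radius one or more times.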

Answer 3

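You could also have a look at Cover Trees, which are particularly well suited to kNN search. I have also found that the PH-Tree (my own work) performs almost as well as cover trees for 15 or 25 dimensions, while being more space efficient in memory, especially for large datasets, and allowing fast insertion and updates.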

Java implementations of many of these algorithms, suitable for comparison, are available in the ELKI Framework.

Finding the closest N points to a given point in N-dimensional space
