Question

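I'd like to use scipy.spatial's KDTree to find nearest-neighbor pairs in a two-dimensional array (essentially a list of lists, where each nested list has dimension 2). I generate my list of lists, pass it to numpy's array, and then create the KDTree instance. However, whenever I try to run "query" on it, I inevitably get weird answers. For example, when I type: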

import numpy as np
from scipy.spatial import KDTree

tree = KDTree(array)
nearest = tree.query(np.array([1, 1]))

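nearest prints out as (0.0, 0). Currently, I'm using an array that is basically y = x for the range (1,50), so I expect to get (2,1) as the nearest neighbor of (1,1).

What am I doing wrong, SciPy gurus?

EDIT: Alternatively, if someone can point me to a KDTree package for Python that they have used for nearest-neighbor searches of a given point, I'd love to hear about it.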

Answer 1

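I have used scipy.spatial before, and it appears to be a nice improvement over scikits.ann, especially with regard to the interface.

In this case I think you have misread what your tree.query(...) call returns. From the scipy.spatial.KDTree.query docs: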

Returns
-------

d : array of floats
    The distances to the nearest neighbors.
    If x has shape tuple+(self.m,), then d has shape tuple if
    k is one, or tuple+(k,) if k is larger than one.  Missing
    neighbors are indicated with infinite distances.  If k is None,
    then d is an object array of shape tuple, containing lists
    of distances. In either case the hits are sorted by distance
    (nearest first).
i : array of integers
    The locations of the neighbors in self.data. i is the same
    shape as d.

So in this case, when you query for the point nearest to [1,1], you are getting:

distance to nearest: 0.0
index of nearest in original array: 0

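This means that [1,1] is the first row of your original data in array, which is to be expected if your data is y = x on the range [1,50].

The scipy.spatial.KDTree.query function has lots of other options, so if, for example, you want to make sure you get the nearest neighbour that isn't the query point itself, try: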
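To make the (distance, index) return value concrete, here is a minimal sketch. The `points` array is an assumption standing in for the asker's data, built as y = x for x in range(1, 50); the real array in the question is not shown.

```python
import numpy as np
from scipy.spatial import KDTree

# Assumed stand-in for the asker's data: points on y = x for x = 1..49.
points = np.array([[i, i] for i in range(1, 50)])
tree = KDTree(points)

# With the default k=1, query returns a (distance, index) pair,
# not the coordinates of the neighbour.
distance, index = tree.query(np.array([1, 1]))

print(distance)          # distance from the query point to its nearest stored point
print(index)             # row of that point in the data used to build the tree
print(tree.data[index])  # look the coordinates up via the index
```

Because [1, 1] is itself in the data, the nearest stored point is the query point, so the distance is zero and the index points at its own row.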

tree.query([1,1], k=2)

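This will return the two nearest neighbours. You can then apply further logic so that, when the first distance returned is zero (i.e. the point queried is one of the data items used to build the tree), the second nearest neighbour is taken rather than the first.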
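The "further logic" could look roughly like the sketch below. The helper name `nearest_other` and the `points` array (y = x for x in range(1, 50)) are assumptions made for illustration; neither comes from the question or from SciPy.

```python
import numpy as np
from scipy.spatial import KDTree

# Assumed stand-in for the asker's data: points on y = x for x = 1..49.
points = np.array([[i, i] for i in range(1, 50)])
tree = KDTree(points)

def nearest_other(tree, point):
    """Hypothetical helper: nearest stored point that is not the query point."""
    # k=2 returns two distances and two indices, sorted nearest first.
    distances, indices = tree.query(point, k=2)
    # A zero first distance means the query point is in the tree,
    # so fall back to the second hit.
    pick = 1 if distances[0] == 0.0 else 0
    return distances[pick], indices[pick]

dist, idx = nearest_other(tree, [1, 1])
print(dist, idx, tree.data[idx])
```

Note that if the data contains exact duplicates of the query point, the second hit will also be at distance zero, so this sketch would need a larger k or an explicit filter in that case.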
