Question

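Given a dictionary of ideal x, y positions, I have an unordered list of real x, y positions that are close to the ideal ones, and I need to assign each of them to the corresponding ideal-position dictionary key. Sometimes I get no data at all (0,0) for a given position. An example data set is: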

idealLoc= {1:(907,1026),
           2:(892,1152),
           3:(921,1364),
           4:(969,1020),
           5:(949,1220),
           6:(951,1404),
   'No_Data':(0,0)}

realLoc = [[  892.,  1152.],
           [  969.,  1021.],
           [  906.,  1026.],
           [  949.,  1220.],
           [  951.,  1404.],
           [    0.,     0.]]

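The output would be a new dictionary with the real positions assigned to the correct dictionary keys from idealLoc. I have considered a brute-force approach (scanning the whole list n times and taking the best match each time), but I was wondering whether there is a more elegant/efficient way?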

Edit: here is the "brute force" method:

Dest = {}
dp = 6
for (y,x) in realLoc:
    for key, (r,c) in idealLoc.items():   
        if x > c-dp and x < c+dp and y > r-dp and y < r+dp:
            Dest[key] = [y,x]
            break

Answer 1

K-d trees are an efficient way to partition data so that fast nearest-neighbour searches can be performed. You can use scipy.spatial.cKDTree to solve your problem:

import numpy as np
from scipy.spatial import cKDTree

# convert inputs to numpy arrays
ilabels, ilocs = (np.array(vv) for vv in zip(*idealLoc.items()))
rlocs = np.array(realLoc)

# construct a K-d tree that partitions the "ideal" points
tree = cKDTree(ilocs)

# query the tree with the real coordinates to find the nearest "ideal" neighbour
# for each "real" point
dist, idx = tree.query(rlocs, k=1)

# get the corresponding labels and coordinates
print(ilabels[idx])
# ['2' '4' '1' '5' '6' 'No_Data']

print(ilocs[idx])
# [[ 892 1152]
#  [ 969 1020]
#  [ 907 1026]
#  [ 949 1220]
#  [ 951 1404]
#  [   0    0]]
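
The question asks for a dictionary rather than arrays, so here is a minimal sketch of how the query result might be turned into one. The name `max_dist` and its value are assumptions added for illustration (loosely mirroring the `dp = 6` window in the question, although that window is square and this cut-off is Euclidean); `distance_upper_bound` makes `query` report misses as an infinite distance.

```python
import numpy as np
from scipy.spatial import cKDTree

idealLoc = {1: (907, 1026), 2: (892, 1152), 3: (921, 1364),
            4: (969, 1020), 5: (949, 1220), 6: (951, 1404),
            'No_Data': (0, 0)}
realLoc = [[892., 1152.], [969., 1021.], [906., 1026.],
           [949., 1220.], [951., 1404.], [0., 0.]]

# keep the keys in a plain list so that ints stay ints (no string conversion)
keys = list(idealLoc)
ilocs = np.array([idealLoc[k] for k in keys], dtype=float)
rlocs = np.array(realLoc, dtype=float)

tree = cKDTree(ilocs)

# max_dist is a hypothetical tolerance, not part of the original answer
max_dist = 6.0
dist, idx = tree.query(rlocs, k=1, distance_upper_bound=max_dist)

Dest = {}
for point, d, i in zip(realLoc, dist, idx):
    # points with no ideal neighbour within max_dist come back with dist == inf
    if np.isfinite(d):
        Dest[keys[i]] = point

print(Dest)
```

With the sample data, key 3 has no nearby real point and is simply absent from `Dest`. If two real points fall nearest to the same ideal point, the later one overwrites the earlier one in this sketch.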

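By default cKDTree uses the Euclidean norm as its distance metric, but you can also specify the Manhattan norm, the max norm, etc. by passing the p= keyword argument to tree.query().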
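
As a small illustration of the `p=` argument, the sketch below queries the same tree under three norms. The query point (910, 1030) is made up for the example and is not part of the question's data.

```python
import numpy as np
from scipy.spatial import cKDTree

# three of the "ideal" points from the question
ilocs = np.array([(907, 1026), (892, 1152), (921, 1364)], dtype=float)
tree = cKDTree(ilocs)

# hypothetical query point, offset by (3, 4) from the first ideal point
point = np.array([[910., 1030.]])

d_euclid, i_euclid = tree.query(point, k=1)        # p=2 is the default
d_manhattan, _ = tree.query(point, k=1, p=1)       # sum of absolute differences
d_max, _ = tree.query(point, k=1, p=np.inf)        # largest absolute difference

print(d_euclid[0], d_manhattan[0], d_max[0])
# 5.0 7.0 4.0
```

The max norm (`p=np.inf`) is the closest analogue of the square `dp` window used in the brute-force loop.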

There is also the scipy.interpolate.NearestNDInterpolator class, which is basically just a convenience wrapper around scipy.spatial.cKDTree.
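
A sketch of how the wrapper might be used for this problem: since the interpolator returns the value stored at the nearest data point, one option (an assumption of this example, not something the answer spells out) is to store each ideal point's position in a key list as its value and cast the result back to integers.

```python
import numpy as np
from scipy.interpolate import NearestNDInterpolator

idealLoc = {1: (907, 1026), 2: (892, 1152), 3: (921, 1364),
            4: (969, 1020), 5: (949, 1220), 6: (951, 1404),
            'No_Data': (0, 0)}
realLoc = [[892., 1152.], [969., 1021.], [906., 1026.],
           [949., 1220.], [951., 1404.], [0., 0.]]

keys = list(idealLoc)
ilocs = np.array([idealLoc[k] for k in keys], dtype=float)

# the "value" at each ideal point is its index into `keys`
interp = NearestNDInterpolator(ilocs, np.arange(len(keys), dtype=float))

# evaluate at the real points, then cast back to int so the result can index `keys`
idx = interp(np.array(realLoc, dtype=float)).astype(int)
matched = [keys[i] for i in idx]

print(matched)
# [2, 4, 1, 5, 6, 'No_Data']
```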

Answer 2

Assuming you want to use the Euclidean distance, you can use scipy.spatial.distance.cdist to compute the distance matrix and then pick the nearest point.

import numpy
from scipy.spatial import distance

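# dict views are not sequences in Python 3, so materialise them as a list first
ideal = numpy.array(list(idealLoc.values()))
real = numpy.array(realLoc)

# dist[i, j] is the Euclidean distance from ideal point i to real point j
dist = distance.cdist(ideal, real)

# for each real point (column), the row index of the closest ideal point
nearest_indexes = dist.argmin(axis=0)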
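
To get from the index array to the dictionary the question asks for, something along these lines should work. It is a sketch only: the `keys` list and the dict comprehension are additions for illustration, and no distance cut-off is applied, so every real point is assigned to its nearest ideal point however far away that is.

```python
import numpy as np
from scipy.spatial import distance

idealLoc = {1: (907, 1026), 2: (892, 1152), 3: (921, 1364),
            4: (969, 1020), 5: (949, 1220), 6: (951, 1404),
            'No_Data': (0, 0)}
realLoc = [[892., 1152.], [969., 1021.], [906., 1026.],
           [949., 1220.], [951., 1404.], [0., 0.]]

# fix an ordering of the keys so that matrix rows can be mapped back to them
keys = list(idealLoc)
ideal = np.array([idealLoc[k] for k in keys], dtype=float)
real = np.array(realLoc, dtype=float)

# full distance matrix: one row per ideal point, one column per real point
dist = distance.cdist(ideal, real)
nearest_indexes = dist.argmin(axis=0)

# map each real point (column j) to the key of its nearest ideal point (row i)
Dest = {keys[i]: realLoc[j] for j, i in enumerate(nearest_indexes)}

print(Dest)
```

Unlike the K-d tree, this builds the whole distance matrix, so memory grows with the product of the two list lengths; for a handful of points that is unlikely to matter.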

What is a good way to match expected values to actual values in Python?
