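Computing the element-wise Euclidean distance between two 3D arrays

Time: 2019-11-28 00:16:16

Tags: python arrays numpy scipy

How can I compute the element-wise Euclidean distance between 2 numpy arrays? For example, I have two arrays (call them A and B), both of size 3x3, and I want to compute the Euclidean distance between the values A[0,0] and B[0,0]. Then I want to compute the Euclidean distance between A[0,1] and B[0,1], and so on. The output array will therefore also be 3x3.

If I try scipy.spatial.distance.cdist, I get the error ValueError: XA must be a 2-dimensional array.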

import numpy as np
from scipy.spatial.distance import cdist

a = np.array([
    [(0,255,0),(255,255,0),(0,255,0)],
    [(0,255,0),(255,255,0),(0,255,0)],
    [(0,255,0),(255,255,0),(0,255,0)],
])

b = np.array([
    [(255,255,0),(255,255,0),(0,255,0)],
    [(255,255,0),(255,255,0),(0,255,0)],
    [(255,255,0),(255,255,0),(0,255,0)],
])

dists = cdist(a, b, 'euclidean')
print(dists)
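I would really like to use a scipy function, because then I can easily switch to other distance metrics, for example cdist(a, b, 'cityblock') or cdist(a, b, 'sqeuclidean').

Edit: the output I want looks like this (the numbers are made up, but the 3x3 array shape is correct):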

[[100, 0, 100]
[100, 0, 100]
[100, 0, 100]]

That is, I expect:

[[cdist((0,255,0), (255,255,0)), cdist((255,255,0), (255,255,0)), cdist((0,255,0), (0,255,0))],
 [...],
 [...]]

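3 answers:

Answer 0: (score: 1)

Several approaches are listed below.

Approach 1

Inspired by this post, we can solve the problem in a vectorized way. Following the wiki contents of the eucl_dist package (disclaimer: I am its author), we can leverage matrix multiplication and some NumPy-specific implementations, like so: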

def elementwise_cdist_v1(a,b):
    s_a = np.einsum('ijk,ijk->ij',a,a)
    s_b = np.einsum('ijk,ijk->ij',b,b)
    return np.sqrt(s_a+s_b-2*np.einsum('ijk,ijk->ij',a,b))

Approach 2

Here is another one along similar lines, also using np.einsum:

def elementwise_cdist_v2(a,b):
    d = a-b
    return np.sqrt(np.einsum('ijk,ijk->ij',d,d))
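Two caveats are worth checking before relying on either approach; the sketch below is one possible way to handle them, not part of the original answer. Approach 1 expands the squared distance as |a|^2 + |b|^2 - 2 a.b, and rounding can push that sum slightly below zero, so the sketch clamps it before the square root. Both approaches subtract or multiply the raw inputs, so unsigned integer data (for example uint8 RGB images, which is an assumption about the use case) should be cast to float first. The function names are hypothetical.

```python
import numpy as np

def elementwise_euclidean(a, b):
    """Distance between a[i, j] and b[i, j] along the last axis (approach 2 style)."""
    # Cast to float: uint8 inputs would otherwise wrap around on subtraction.
    d = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    return np.sqrt(np.einsum('...k,...k->...', d, d))

def elementwise_euclidean_expanded(a, b):
    """Same result via |a|^2 + |b|^2 - 2 a.b (approach 1 style)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    sq = (np.einsum('...k,...k->...', a, a)
          + np.einsum('...k,...k->...', b, b)
          - 2 * np.einsum('...k,...k->...', a, b))
    # Rounding can make tiny squared distances slightly negative,
    # which would turn into NaN under sqrt, so clamp at zero.
    return np.sqrt(np.maximum(sq, 0))

rng = np.random.default_rng(0)
a = rng.random((4, 5, 3))
b = rng.random((4, 5, 3))

# Independent reference: norm of the difference along the last axis.
reference = np.linalg.norm(a - b, axis=-1)
print(np.allclose(elementwise_euclidean(a, b), reference))
print(np.allclose(elementwise_euclidean_expanded(a, b), reference))
```

The `...` in the einsum subscripts lets the same functions accept any number of leading axes, not only 3D input.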

Timings on large arrays:

We use random data whose last axis has length 3, a common case when working with xyz coordinate data:

In [72]: np.random.seed(0)
    ...: a = np.random.rand(1000,1000,3)
    ...: b = np.random.rand(1000,1000,3)

In [73]: %timeit elementwise_cdist_v1(a,b)
10 loops, best of 3: 23.9 ms per loop

In [74]: %timeit elementwise_cdist_v2(a,b)
100 loops, best of 3: 13.2 ms per loop

Answer 1: (score: 0)

Reduce the dimensionality of the input arrays and cdist will work:

import numpy as np
from scipy.spatial.distance import cdist

a = np.array([
    (0,255,0),
    (255,255,0),
    (0,255,0),
    (0,255,0),
    (255,255,0),
    (0,255,0),
    (0,255,0),
    (255,255,0),
    (0,255,0),
])

b = np.array([
    (255,255,0),
    (255,255,0),
    (0,255,0),
    (255,255,0),
    (255,255,0),
    (0,255,0),
    (255,255,0),
    (255,255,0),
    (0,255,0),
])

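# Full 9x9 matrix of distances between every row of a and every row of b
dist_matrix=cdist(a,b)
# The diagonal holds the distances between matching rows a[i] and b[i]
pair_dist=np.diag(dist_matrix,0)

dist3x3=np.reshape(pair_dist,(3,3))
print("dist_matrix")
print(dist_matrix)
print("pair_dist\n",pair_dist)
print("dist3x3\n",dist3x3)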

Output:

dist_matrix
[[255. 255.   0. 255. 255.   0. 255. 255.   0.]
 [  0.   0. 255.   0.   0. 255.   0.   0. 255.]
 [255. 255.   0. 255. 255.   0. 255. 255.   0.]
 [255. 255.   0. 255. 255.   0. 255. 255.   0.] 
 [  0.   0. 255.   0.   0. 255.   0.   0. 255.] 
 [255. 255.   0. 255. 255.   0. 255. 255.   0.]
 [255. 255.   0. 255. 255.   0. 255. 255.   0.]
 [  0.   0. 255.   0.   0. 255.   0.   0. 255.]
 [255. 255.   0. 255. 255.   0. 255. 255.   0.]]

pair_dist
 [255.   0.   0. 255.   0.   0. 255.   0.   0.]
dist3x3
 [[255.   0.   0.]
 [255.   0.   0.]
 [255.   0.   0.]]
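The flatten, cdist, diagonal, reshape steps above can be wrapped in a small helper so that the original 3D arrays can be passed in directly and the metric stays configurable, which is what the question asks for. This is only a sketch: the name `elementwise_cdist` is made up, and it assumes both inputs have the same shape with the feature vector on the last axis.

```python
import numpy as np
from scipy.spatial.distance import cdist

def elementwise_cdist(a, b, metric='euclidean'):
    """Distance between a[i, j] and b[i, j] using any metric cdist supports."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("a and b must have the same shape")
    k = a.shape[-1]
    # Flatten the leading axes so cdist receives 2D (N, k) inputs.
    full = cdist(a.reshape(-1, k), b.reshape(-1, k), metric)
    # Only the diagonal pairs a[i] with b[i]; restore the leading shape.
    return np.diag(full).reshape(a.shape[:-1])

a = np.array([[(0, 255, 0), (255, 255, 0), (0, 255, 0)]] * 3)
b = np.array([[(255, 255, 0), (255, 255, 0), (0, 255, 0)]] * 3)

print(elementwise_cdist(a, b))
print(elementwise_cdist(a, b, 'cityblock'))
print(elementwise_cdist(a, b, 'sqeuclidean'))
```

Note that the intermediate matrix is N x N for N points, so this is only practical for small arrays.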

Answer 2: (score: 0)

First, a simple NumPy solution for the Euclidean case:

>>> np.sqrt(np.sum((a-b)**2, axis=2))
array([[255.,   0.,   0.],
       [255.,   0.,   0.],
       [255.,   0.,   0.]])
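The same pattern extends to a few of the other metrics mentioned in the question without going through cdist at all. The sketch below is an assumption about what "other metrics" might be needed; it covers only four of them, and the function name is hypothetical.

```python
import numpy as np

def elementwise_distance(a, b, metric='euclidean'):
    """Element-wise distance along the last axis for a few common metrics."""
    # Cast to float so unsigned integer inputs do not wrap on subtraction.
    d = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    if metric == 'euclidean':
        return np.sqrt((d ** 2).sum(axis=-1))
    if metric == 'sqeuclidean':
        return (d ** 2).sum(axis=-1)
    if metric == 'cityblock':
        return np.abs(d).sum(axis=-1)
    if metric == 'chebyshev':
        return np.abs(d).max(axis=-1)
    raise ValueError("unsupported metric: " + str(metric))

# A single pair of points whose difference is (3, 4, 0).
p = np.array([[[0, 0, 0]]])
q = np.array([[[3, 4, 0]]])
for name in ('euclidean', 'sqeuclidean', 'cityblock', 'chebyshev'):
    print(name, elementwise_distance(p, q, name))
```

This stays linear in the number of points, at the cost of writing each metric by hand.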

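You said you would rather use cdist. Note that using cdist for element-wise distances is wasteful, because the function computes the distance between every pair of elements (for N points that is an N x N matrix, of which only the diagonal is needed). If performance is not a concern, though, try the following: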

>>> np.diag(cdist(a.reshape(-1, 3), b.reshape(-1, 3), 'euclidean')).reshape(-1, 3)
array([[255.,   0.,   0.],
       [255.,   0.,   0.],
       [255.,   0.,   0.]])

Edit: a solution whose memory requirements scale more reasonably with the size of the arrays:

>>> np.array([
        cdist(x, y, 'euclidean')
        for (x, y) in zip(a.reshape(-1, 1, 3), b.reshape(-1, 1, 3))
    ]).reshape(-1, 3)
array([[255.,   0.,   0.],
       [255.,   0.,   0.],
       [255.,   0.,   0.]])
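A possible middle ground between one huge cdist call and one call per pair is to process the points in fixed-size chunks, so that the temporary matrix never exceeds chunk x chunk entries. This is a sketch under the assumption that both arrays share the same shape; the function name and the default chunk size are arbitrary choices, not tuned values.

```python
import numpy as np
from scipy.spatial.distance import cdist

def elementwise_cdist_chunked(a, b, metric='euclidean', chunk=1024):
    """Element-wise cdist with temporary memory bounded by chunk**2 entries."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    k = a.shape[-1]
    fa = a.reshape(-1, k)
    fb = b.reshape(-1, k)
    out = np.empty(len(fa))
    for start in range(0, len(fa), chunk):
        stop = start + chunk
        # Distances between all pairs inside this chunk; only the
        # diagonal (matching indices) is kept.
        block = cdist(fa[start:stop], fb[start:stop], metric)
        out[start:stop] = np.diag(block)
    return out.reshape(a.shape[:-1])

rng = np.random.default_rng(0)
x = rng.random((7, 5, 3))
y = rng.random((7, 5, 3))
result = elementwise_cdist_chunked(x, y, chunk=4)
print(np.allclose(result, np.linalg.norm(x - y, axis=-1)))
```

Each chunk still computes chunk**2 distances to keep only chunk of them, so smaller chunks waste less work but make more cdist calls.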