Question

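I have 3 huge numpy arrays, and I want to build a function that computes the pairwise Euclidean distances from the points of one array to the points of the second and third arrays.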

For simplicity, suppose I have the following 3 arrays:

a = np.array([[1.64,0.001,1.56,0.1],
              [1.656,1.21,0.32,0.0001],
              [1.0002,0.0003,1.111,0.0003],
              [0.223,0.6665,1.2221,1.659]])

b = np.array([[1.64,0.001,1.56,0.1],
              [1.656,1.21,0.32,0.0001],
              [1.0002,0.0003,1.111,0.0003],
              [0.223,0.6665,1.2221,1.659]])

c = np.array([[1.64,0.001,1.56,0.1],
              [1.656,1.21,0.32,0.0001],
              [1.0002,0.0003,1.111,0.0003],
              [0.223,0.6665,1.2221,1.659]])

I have tried this:

def correlation(x, y, t):
    from math import sqrt

    for a,b, in zip(x,y,t):
        distance = sqrt((x[a]-x[b])**2 + (y[a]-y[b])**2 + (t[a]-t[b])**2 )
    return distance

But this code raises an error: ValueError: too many values to unpack (expected 2)

How can I implement this correctly using numpy or plain Python?

Thanks in advance.

Answer 1

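First, we define a function that computes the Euclidean distance between the rows of two matrices. Thanks to numpy broadcasting, the first argument can also be a single row, which is then compared against every row of the second argument.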

def pairwise_distance(f, s, keepdims=False):
    return np.sqrt(np.sum((f-s)**2, axis=1, keepdims=keepdims))
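As a small illustration of how this function behaves, here is a sketch with made-up 2-D points (the sample array is an assumption chosen so the distances are easy to check by hand, not data from the question):

```python
import numpy as np

def pairwise_distance(f, s, keepdims=False):
    # Euclidean distance along axis 1, after broadcasting f against s
    return np.sqrt(np.sum((f - s) ** 2, axis=1, keepdims=keepdims))

# Illustrative points only (3-4-5 triangles keep the arithmetic simple)
pts = np.array([[0.0, 0.0],
                [3.0, 4.0],
                [6.0, 8.0]])

# A single row of shape (2,) broadcasts against all rows of shape (3, 2)
from_first_row = pairwise_distance(pts[0], pts)    # distances 0, 5, 10

# Two matrices of equal shape are compared row by row
row_by_row = pairwise_distance(pts, pts[::-1])     # distances 10, 0, 10

# keepdims=True keeps the result as a column of shape (3, 1)
as_column = pairwise_distance(pts[0], pts, keepdims=True)
```

Passing a single row is the case the next function relies on.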

Second, we define a function that computes the distances between all possible pairs of rows of the same matrix:

def all_distances(c):
    n = c.shape[0]
    res = np.empty(shape=(n, n), dtype=float)  # one entry per pair of rows
    for row in range(n):
        res[row, :] = pairwise_distance(c[row], c)  # c[row] broadcasts against every row of c
    return res

Now we are done:

row_distances = all_distances(a)       # distances between the rows of the matrix a
column_distances = all_distances(a.T)  # distances between the columns of the same matrix
row_distances[0, 2]  # distance between the first and third rows
row_distances[1, 3]  # distance between the second and fourth rows
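The question asks for distances from one array to a second and a third array, so the same broadcasting idea can be extended to two different matrices. The following is only a sketch: `cross_distances` is a hypothetical helper name, and the sample arrays are made up for easy checking.

```python
import numpy as np

def cross_distances(p, q):
    """Hypothetical helper: distance from every row of p to every row of q."""
    # p[:, None, :] has shape (n, 1, d); q[None, :, :] has shape (1, m, d)
    diff = p[:, None, :] - q[None, :, :]          # broadcasts to shape (n, m, d)
    return np.sqrt(np.sum(diff ** 2, axis=-1))    # shape (n, m)

# Illustrative data only
a = np.array([[0.0, 0.0], [3.0, 4.0]])
b = np.array([[0.0, 0.0], [6.0, 8.0], [3.0, 0.0]])
c = np.array([[3.0, 4.0]])

a_to_b = cross_distances(a, b)  # entry [i, j] is the distance from a[i] to b[j]
a_to_c = cross_distances(a, c)  # entry [i, j] is the distance from a[i] to c[j]
```

This avoids the Python loop, but it builds an intermediate array of shape (n, m, d). For truly huge inputs that may not fit in memory, so processing `p` in chunks, or using `scipy.spatial.distance.cdist` if SciPy is available, is likely the safer choice.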

Answer 2

Start with two arrays:

a = np.array([[1.64,0.001,1.56,0.1],
              [1.656,1.21,0.32,0.0001],
              [1.0002,0.0003,1.111,0.0003],
              [0.223,0.6665,1.2221,1.659]])

b = np.array([[1.64,0.001,1.56,0.1],
              [1.656,1.21,0.32,0.0001],
              [1.0002,0.0003,1.111,0.0003],
              [0.223,0.6665,1.2221,1.659]])

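To combine the elements of these arrays row by row (row i of a with row i of b), you can do the following. Note that the expression takes the square root of the sum of the squares of each pair of elements; a Euclidean distance would instead square the difference each - b[index] and sum over the row: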

pairwise_dist_between_a_and_b=[(each**2+b[index]**2)**0.5 for index, each in enumerate(a)]

This gives the following value for pairwise_dist_between_a_and_b:

[array([2.31931024e+00, 1.41421356e-03, 2.20617316e+00, 1.41421356e-01]),
 array([2.34193766e+00, 1.71119841e+00, 4.52548340e-01, 1.41421356e-04]),
 array([1.41449641e+00, 4.24264069e-04, 1.57119127e+00, 4.24264069e-04]),
 array([0.31536962, 0.94257334, 1.72831039, 2.3461803 ])]

You can use the same list comprehension for the first and third arrays.
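The list comprehension above returns one array per row because it works element by element on `each**2 + b[index]**2`. If what is wanted is a single Euclidean distance per pair of rows, one possible sketch uses `np.linalg.norm` on the difference. The small arrays here are illustrative assumptions, not the data from the question:

```python
import numpy as np

# Illustrative data only
a = np.array([[0.0, 0.0], [1.0, 1.0]])
b = np.array([[3.0, 4.0], [1.0, 1.0]])

# What the list comprehension computes: sqrt(a**2 + b**2), element by element
elementwise = np.sqrt(a ** 2 + b ** 2)        # shape (2, 2)

# Euclidean distance between rows at the same index: one number per row
row_dist = np.linalg.norm(a - b, axis=1)      # shape (2,)
```

With the arrays from the question, where a and b are identical, the row distances would all be zero.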

Computing Euclidean distances between 3 numpy arrays from scratch
