Question

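I searched around and found similar questions and answers, but none of them gave me the correct result.

Situation: I have an array in which several cells have the value 1, while the remaining cells are set to zero. Each cell is a square (width = height). Now I want to calculate the average distance between all the 1 values. The formula should look like this: d = sqrt ( (( x2 - x1 )*size)**2 + (( y2 - y1 )*size)**2 )

Example: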

import numpy as np
from scipy.spatial.distance import pdist

a = np.array([[1, 0, 1],
              [0, 0, 0],
              [0, 0, 1]])

# Given that each cell is 10m wide/high
val = 10
d = pdist(a, lambda u, v: np.sqrt( ( ((u-v)*val)**2).sum() ) )
d
array([ 14.14213562,  10.        ,  10.        ])

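Afterwards I would calculate the average via d.mean(). However, the result in d is obviously wrong, because the distance between the two cells in the top row should already be 20 (two cells crossed * 10). Is something wrong with my formula, my math, or my approach?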

Answer 1

You need the actual coordinates of the nonzero markers in order to compute the distances between them:

>>> import numpy as np
>>> from scipy.spatial.distance import squareform, pdist
>>> a = np.array([[1, 0, 1],
...               [0, 0, 0],
...               [0, 0, 1]])
>>> np.where(a)
(array([0, 0, 2]), array([0, 2, 2]))
>>> x,y = np.where(a)
>>> coords = np.vstack((x,y)).T
>>> coords
array([[0, 0],   # That's the coordinate of the "1" in the top left,
       [0, 2],   # top right,
       [2, 2]])  # and bottom right.
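
As a side note, the same coordinate array can be built in a single call. The snippet below is only a small sketch comparing the np.where route used above with np.argwhere; the names rows, cols and coords_alt are mine and not part of the original answer.

```python
import numpy as np

a = np.array([[1, 0, 1],
              [0, 0, 0],
              [0, 0, 1]])

# np.where returns one index array per axis; stacking and transposing
# them gives one (row, col) pair per nonzero cell.
rows, cols = np.where(a)
coords = np.vstack((rows, cols)).T

# np.argwhere returns the same (n_points, 2) array directly.
coords_alt = np.argwhere(a)

print(coords_alt)
```

Either form works as input to pdist, since both have one point per row.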

Next, you want to compute the distances between these points. You can use pdist for that, for example:

>>> dists = pdist(coords) * 10  # Uses the Euclidean distance metric by default.
>>> squareform(dists)
array([[  0.        ,  20.        ,  28.28427125],
       [ 20.        ,   0.        ,  20.        ],
       [ 28.28427125,  20.        ,   0.        ]])

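In this last matrix you will find (above the diagonal) the distance from each marked point in a to every other one. In this case you have 3 coordinates, so it gives you the distance between node 0 (a[0,0]) and node 1 (a[0,2]), between node 0 and node 2 (a[2,2]), and finally between node 1 and node 2. Put differently, if S = squareform(dists), then S[i,j] returns the distance between the coordinates on row i of coords and those on row j.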
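
To make the indexing concrete, here is a minimal sketch of how the square matrix relates to the condensed vector that pdist returns. The coordinates are hard-coded from the example, and the names d_02, iu and upper are only illustrative.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

# Coordinates of the three marked cells from the example grid.
coords = np.array([[0, 0], [0, 2], [2, 2]])

dists = pdist(coords) * 10   # condensed vector: pairs (0,1), (0,2), (1,2)
S = squareform(dists)        # symmetric 3x3 matrix with a zero diagonal

# S[i, j] is the distance between coords[i] and coords[j].
d_02 = S[0, 2]

# Reading the strict upper triangle row by row recovers the condensed vector.
iu = np.triu_indices(len(coords), k=1)
upper = S[iu]

print(d_02)
print(upper)
```

So the square matrix is convenient for looking up a particular pair, but it carries no information beyond what is already in dists.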

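Only the values in the upper triangle of this last matrix are relevant, and they are exactly the values already stored in the variable dists. You can therefore derive the mean directly from dists, without performing the relatively expensive squareform computation (which is used here for demonstration purposes only):

>>> dists
array([ 20.       ,  28.2842712,  20.       ])
>>> dists.mean()
22.761423749153966

Note that your computed solution "looks" almost correct (apart from a factor of 2) only because of the example you chose. What pdist does is take the Euclidean distance between the first point and the second point in n-dimensional space, then between the first point and the third point, and so on. In your example, that means it computes the distance between the rows of a: the point on row 0 has coordinates in 3-dimensional space given by [1,0,1], and the second point is [0,0,0]. The Euclidean distance between those two is sqrt(2)~1.4. Then, the distance between the first and the third coordinate (the last row in a) is only 1. Finally, the distance between the second coordinate (row 1: [0,0,0]) and the third (last row, row 2: [0,0,1]) is also 1. So remember, pdist interprets its first argument as a stack of coordinates in n-dimensional space, n being the number of elements in the tuple of each node.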
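
The difference between the two interpretations can be checked side by side. The following is a rough sketch, not a definitive implementation: mean_marked_distance is a hypothetical helper name I made up for this post, and it assumes square cells with a single cell_size, as stated in the question.

```python
import numpy as np
from scipy.spatial.distance import pdist


def mean_marked_distance(grid, cell_size):
    """Mean pairwise Euclidean distance between the nonzero cells of grid.

    Hypothetical helper: assumes square cells with edge length cell_size.
    """
    coords = np.argwhere(grid)  # one (row, col) pair per nonzero cell
    return (pdist(coords) * cell_size).mean()


a = np.array([[1, 0, 1],
              [0, 0, 0],
              [0, 0, 1]])

# Passing the grid itself: every ROW is treated as one point in 3-D space,
# so these are distances between rows, not between marked cells.
row_dists = pdist(a)
print(row_dists)

# Passing the coordinates of the marked cells gives the intended result.
print(mean_marked_distance(a, 10))
```

With fewer than two nonzero cells there are no pairs, so pdist returns an empty array and the mean comes out as nan; you may want to handle that case explicitly.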

Calculate average weighted Euclidean distance between values in numpy
