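The formula for the distance between two points in the (x, y) plane is well known and straightforward.
However, what is the best way to approach a problem with n points, for which you want to calculate the average distance?
Example: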
import matplotlib.pyplot as plt
x=[89.86, 23.0, 9.29, 55.47, 4.5, 59.0, 1.65, 56.2, 18.53, 40.0]
y=[78.65, 28.0, 63.43, 66.47, 68.0, 69.5, 86.26, 84.2, 88.0, 111.0]
plt.scatter(x, y,color='k')
plt.show()
The distance between two points (x1, y1) and (x2, y2) is simply:
import math
dist=math.sqrt((x2-x1)**2+(y2-y1)**2)
But this is a problem of combinations without repetition. How should I approach it?
Answer 0 (score: 9)
itertools.combinations gives combinations without repetition:
>>> import itertools
>>> for combo in itertools.combinations([(1,1), (2,2), (3,3), (4,4)], 2):
...     print(combo)
...
((1, 1), (2, 2))
((1, 1), (3, 3))
((1, 1), (4, 4))
((2, 2), (3, 3))
((2, 2), (4, 4))
((3, 3), (4, 4))
Code for the problem in the question:
import math
from itertools import combinations

def dist(p1, p2):
    # Euclidean distance between two (x, y) points
    (x1, y1), (x2, y2) = p1, p2
    return math.sqrt((x2 - x1)**2 + (y2 - y1)**2)

x = [89.86, 23.0, 9.29, 55.47, 4.5, 59.0, 1.65, 56.2, 18.53, 40.0]
y = [78.65, 28.0, 63.43, 66.47, 68.0, 69.5, 86.26, 84.2, 88.0, 111.0]
points = list(zip(x, y))
# One distance per unordered pair of distinct points
distances = [dist(p1, p2) for p1, p2 in combinations(points, 2)]
avg_distance = sum(distances) / len(distances)
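As a compact variant of the same idea, here is a minimal sketch that assumes Python 3.8 or later, where math.dist and statistics.fmean are available. The function name average_pairwise_distance and the triangle test data are made up for illustration and are not part of the answer above.

```python
import math
from itertools import combinations
from statistics import fmean

def average_pairwise_distance(points):
    """Mean Euclidean distance over all unordered pairs of distinct points."""
    # math.dist takes two coordinate sequences of equal length (Python 3.8+).
    # fmean raises StatisticsError if there are fewer than two points,
    # because there is then no pair to average over.
    return fmean(math.dist(p, q) for p, q in combinations(points, 2))

# A 3-4-5 right triangle: the three pairwise distances are 3, 4 and 5.
triangle = [(0.0, 0.0), (3.0, 0.0), (3.0, 4.0)]
print(average_pairwise_distance(triangle))  # 4.0
```

The triangle makes the result easy to check by hand: (3 + 4 + 5) / 3 = 4.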
Answer 1 (score: 4)
In this case, you need to loop over the sequence of points:
from math import sqrt

def avg_distance(x, y):
    n = len(x)
    dist = 0
    for i in range(n):
        xi = x[i]
        yi = y[i]
        # only visit j > i so that each pair is counted once
        for j in range(i+1, n):
            dx = x[j]-xi
            dy = y[j]-yi
            dist += sqrt(dx*dx+dy*dy)
    return 2.0*dist/(n*(n-1))
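In the last step, we divide the total distance by n × (n-1) / 2, which is the result of:
\[ \sum_{i=1}^{n-1} i = \frac{n(n-1)}{2} \]
and is therefore the total number of distances we have computed.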
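Here we do not measure the distance between a point and itself (which is of course always 0). Note that this affects the average, since those zero distances are not counted in the denominator either.
Given n points, this algorithm runs in O(n^2).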
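To make the effect of excluding self-distances concrete, the following sketch compares the two conventions. Both function names and the unit-square data are hypothetical and chosen only for illustration; math.dist assumes Python 3.8 or later.

```python
import math

def mean_over_distinct_pairs(points):
    """Average over the n*(n-1)/2 unordered pairs of distinct points."""
    n = len(points)
    total = sum(
        math.dist(points[i], points[j])
        for i in range(n)
        for j in range(i + 1, n)
    )
    return total / (n * (n - 1) / 2)

def mean_over_all_ordered_pairs(points):
    """Average over all n*n ordered pairs, including each point with itself."""
    n = len(points)
    total = sum(math.dist(p, q) for p in points for q in points)
    return total / (n * n)

# Corners of a unit square (illustrative data only)
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
n = len(square)
distinct = mean_over_distinct_pairs(square)
with_self = mean_over_all_ordered_pairs(square)

# Including the n zero self-distances scales the average by (n - 1) / n.
print(math.isclose(with_self, distinct * (n - 1) / n))  # True
```

The second convention counts every pair twice and adds n zeros, so its total is twice the first total while its denominator is n^2. That is where the factor (n - 1) / n comes from.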
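Answer 2 (score: 1)
You can solve this problem (probably more efficiently) with the pdist function from the SciPy library. It computes the pairwise distances between observations in n-dimensional space.
To solve this problem, you can use the following function: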
from scipy.spatial.distance import pdist
import numpy as np

def compute_average_distance(X):
    """
    Computes the average distance among a set of n points in the d-dimensional space.

    Arguments:
        X {numpy array} - the query points in an array of shape (n,d),
                          where n is the number of points and d is the dimension.
    Returns:
        {float} - the average distance among the points
    """
    return np.mean(pdist(X))
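For the data in the question, the x and y lists first have to be combined into an array of shape (n, 2). The sketch below is one possible way to do that, using np.column_stack and the default Euclidean metric of pdist. It assumes NumPy and SciPy are installed.

```python
import numpy as np
from scipy.spatial.distance import pdist

x = [89.86, 23.0, 9.29, 55.47, 4.5, 59.0, 1.65, 56.2, 18.53, 40.0]
y = [78.65, 28.0, 63.43, 66.47, 68.0, 69.5, 86.26, 84.2, 88.0, 111.0]

# Stack the coordinates into one row per point: shape (10, 2)
X = np.column_stack((x, y))

# pdist returns a condensed 1-D array with one entry per unordered pair,
# so its length is n*(n-1)/2 = 45 for n = 10 points.
pairwise = pdist(X)
print(pairwise.shape)  # (45,)

# Same quantity that compute_average_distance(X) returns
average = float(np.mean(pairwise))
print(average)
```

Because pdist only returns each unordered pair once and never a point paired with itself, the mean here follows the same convention as the other two answers.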