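I have a dataframe containing latitude and longitude points (approx. 1 million).
I want to compute the haversine distance between each point and every other point.
Example: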
from haversine import haversine  # assumes the PyPI "haversine" package
lat1 = 40.5; lat2 = 42; long1 = -90; long2 = -93
print(haversine((lat1, long1), (lat2, long2)))  # kilometres by default
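But computing the full 1M x 1M matrix makes no sense, because each pass needs one fewer calculation than the previous one: the distance from point 1 to point 2 is the same as the distance from point 2 to point 1.
How can I reduce the number of calculations at each step?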
Answer 0 (score: 0)
Use a vectorized approach:
Demo:
In [105]: import numpy as np
     ...: from sklearn.neighbors import DistanceMetric
In [106]: dist = DistanceMetric.get_metric('haversine')
In [107]: df
Out[107]:
    latitude   longitude
0  38.550420 -121.391416
1  38.473501 -121.490186
2  38.657846 -121.462101
3  38.506774 -121.426951
4  38.637448 -121.384613
In [108]: earth_radius = 6371  # mean Earth radius in km, so D is in kilometres
     ...: D = dist.pairwise(np.radians(df), np.radians(df)) * earth_radius
     ...:
In [109]: D
Out[109]:
array([[ 0. , 12.12461135, 13.43188915, 5.75400608, 9.69511663],
[ 12.12461135, 0. , 20.64315089, 6.63158885, 20.41101851],
[ 13.43188915, 20.64315089, 0. , 17.07403265, 7.10128697],
[ 5.75400608, 6.63158885, 17.07403265, 0. , 14.9892082 ],
[ 9.69511663, 20.41101851, 7.10128697, 14.9892082 , 0. ]])
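The matrix above is symmetric with a zero diagonal, so each distance is computed twice. As a rough sketch of how to skip the redundant half, the snippet below computes only the pairs with i < j using plain NumPy. The function names `haversine_km` and `upper_triangle_distances` are made up for illustration, and the Earth radius of 6371 km is the same assumption used in the demo.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius, same value as in the demo


def haversine_km(lat1, lon1, lat2, lon2):
    """Vectorized haversine distance in km; inputs are in degrees."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    a = np.sin(dlat / 2) ** 2 + np.cos(lat1) * np.cos(lat2) * np.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(a))


def upper_triangle_distances(lat, lon):
    """Return index arrays i, j (with i < j) and the distance for each pair."""
    lat = np.asarray(lat, dtype=float)
    lon = np.asarray(lon, dtype=float)
    # k=1 excludes the diagonal, so every unordered pair appears exactly once
    i, j = np.triu_indices(len(lat), k=1)
    return i, j, haversine_km(lat[i], lon[i], lat[j], lon[j])


# The five points from the demo dataframe
lat = [38.550420, 38.473501, 38.657846, 38.506774, 38.637448]
lon = [-121.391416, -121.490186, -121.462101, -121.426951, -121.384613]
i, j, d = upper_triangle_distances(lat, lon)
for a, b, dist_km in zip(i, j, d):
    print(a, b, round(float(dist_km), 3))
```

This halves the work but does not change the quadratic growth: for about 1 million points there are still roughly 5 x 10^11 pairs, which will not fit in memory as a single array. In practice the index ranges would probably need to be processed in chunks, or the problem restricted to pairs within some radius.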