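I want to use the Haversine formula to generate a 500x500 distance matrix from the latitudes and longitudes of 500 locations.
Here is sample data for 10 locations, "coordinate.csv":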
Name,Latitude,Longitude
depot1,35.492807,139.6681689
depot2,33.6625572,130.4096027
depot3,35.6159881,139.7805445
customer1,35.622632,139.732631
customer2,35.857287,139.821461
customer3,35.955313,139.615387
customer4,35.16073,136.926239
customer5,36.118163,139.509548
customer6,35.937351,139.909783
customer7,35.949508,139.676462
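Once I have the distance matrix, I want to use it to find the depot closest to each customer, then save the output (the distance from each customer to its closest depot, and the name of that depot) to a Pandas DataFrame.
Expected output: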
// Distance matrix
[ [..],[..],[..],[..],[..],[..],[..],[..],[..],[..] ]
// Closest depot to each customer (just an example)
Name,Latitude,Longitude,Distance_to_closest_depot,Closest_depot
depot1,35.492807,139.6681689,,
depot2,33.6625572,130.4096027,,
depot3,35.6159881,139.7805445,,
customer1,35.622632,139.732631,10,depot1
customer2,35.857287,139.821461,20,depot3
customer3,35.955313,139.615387,15,depot2
customer4,35.16073,136.926239,12,depot3
customer5,36.118163,139.509548,25,depot1
customer6,35.937351,139.909783,22,depot2
customer7,35.949508,139.676462,15,depot1
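Answer 0: (score: 1)
A couple of library functions can help here: cdist from scipy.spatial.distance builds a distance matrix using whichever metric you pass it, and the haversine package provides a haversine function that can serve as that metric.
After that, it is just a matter of finding the row-wise minimum of the distance matrix and adding it to the DataFrame. Full code: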
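import pandas as pd
from scipy.spatial.distance import cdist
from haversine import haversine

# Copy the sample data to the clipboard first, or use pd.read_csv('coordinate.csv')
df = pd.read_clipboard(sep=',')
df.set_index('Name', inplace=True)

# .copy() so the new columns below are not assigned to a slice of df
customers = df[df.index.str.startswith('customer')].copy()
depots = df[df.index.str.startswith('depot')]

# customers x depots matrix; haversine returns kilometers by default
dm = cdist(customers, depots, metric=haversine)
closest = dm.argmin(axis=1)   # position of the nearest depot for each customer
distances = dm.min(axis=1)    # distance to that depot

customers['Closest Depot'] = depots.index[closest]
customers['Distance'] = distances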
Result (distances in kilometers):
            Latitude   Longitude  Closest Depot    Distance
Name
customer1  35.622632  139.732631         depot3    4.393506
customer2  35.857287  139.821461         depot3   27.084212
customer3  35.955313  139.615387         depot3   40.565820
customer4  35.160730  136.926239         depot1  251.466152
customer5  36.118163  139.509548         depot3   60.945377
customer6  35.937351  139.909783         depot3   37.587862
customer7  35.949508  139.676462         depot3   38.255776
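For the full 500-location case, passing a Python function to cdist means one Python call per pair of points. If that turns out to be slow, the Haversine formula can also be written directly with NumPy broadcasting. The following is only a sketch under stated assumptions: the name haversine_matrix is made up for this example, and the Earth radius of 6371.0088 km is an assumed mean value, so results may differ slightly from other implementations.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0088  # assumed mean Earth radius

def haversine_matrix(a, b):
    """Great-circle distances in km between every row of a and every row of b.

    a has shape (n, 2) and b has shape (m, 2); each row is
    (latitude, longitude) in degrees. Returns an (n, m) array.
    """
    a = np.radians(np.asarray(a, dtype=float))
    b = np.radians(np.asarray(b, dtype=float))
    lat1, lon1 = a[:, 0][:, None], a[:, 1][:, None]   # column vectors
    lat2, lon2 = b[:, 0][None, :], b[:, 1][None, :]   # row vectors
    h = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    # clip guards against tiny floating-point overshoot before the sqrt/arcsin
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(np.clip(h, 0.0, 1.0)))

# First four rows of the sample data
coords = np.array([
    [35.492807, 139.6681689],   # depot1
    [33.6625572, 130.4096027],  # depot2
    [35.6159881, 139.7805445],  # depot3
    [35.622632, 139.732631],    # customer1
])
dm = haversine_matrix(coords, coords)  # square 4x4 matrix, zeros on the diagonal
print(dm.round(3))
```

Calling haversine_matrix(customer_coords, depot_coords) instead gives the rectangular customers-by-depots matrix, which is all that the closest-depot lookup needs.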
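Following the comments, I wrote an alternative solution that uses a square distance matrix instead. I think the original solution is better, because the question says we only want the closest depot for each customer, so the customer-to-customer and depot-to-depot distances never need to be computed. However, if you need the square distance matrix for some other purpose, here is how to create it: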
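import pandas as pd
import numpy as np
from scipy.spatial.distance import squareform, pdist
from haversine import haversine

# Copy the sample data to the clipboard first, or use pd.read_csv('coordinate.csv')
df = pd.read_clipboard(sep=',')
df.set_index('Name', inplace=True)

# pdist returns the condensed pairwise distances; squareform expands them to N x N
dm = pd.DataFrame(squareform(pdist(df, metric=haversine)), index=df.index, columns=df.index)
np.fill_diagonal(dm.values, np.inf)  # Makes it easier to find minimums

# .copy() so the new columns below are not assigned to a slice of df
customers = df[df.index.str.startswith('customer')].copy()
depots = df[df.index.str.startswith('depot')]

# Rows are depots, columns are customers, so idxmin()/min() work per customer
customers['Closest Depot'] = dm.loc[depots.index, customers.index].idxmin()
customers['Distance'] = dm.loc[depots.index, customers.index].min()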
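The final result is the same as before, except that you now have the square distance matrix. If you like, you can put the zeros back on the diagonal once the minimums have been extracted: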
np.fill_diagonal(dm.values, 0)
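Both solutions return a customers-only table, whereas the expected output in the question keeps the depot rows and leaves their two new columns blank. One possible way to get that shape is to join the per-customer result back onto the full table. This is a rough sketch rather than part of the original answer: the helper haversine_km and its 6371.0088 km radius are assumptions made here so that the example needs only NumPy and pandas, and it uses a five-row subset of the sample data.

```python
import numpy as np
import pandas as pd

def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0088):
    """Vectorized haversine distance in km; inputs in degrees, broadcastable.

    Hypothetical helper for this sketch; the radius is an assumed mean value.
    """
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    h = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * radius_km * np.arcsin(np.sqrt(h))

# Subset of the sample data from the question
df = pd.DataFrame({
    "Name": ["depot1", "depot2", "depot3", "customer1", "customer4"],
    "Latitude": [35.492807, 33.6625572, 35.6159881, 35.622632, 35.16073],
    "Longitude": [139.6681689, 130.4096027, 139.7805445, 139.732631, 136.926239],
})

is_customer = df["Name"].str.startswith("customer").to_numpy()
cust, dep = df[is_customer], df[~is_customer]

# customers x depots distance matrix via broadcasting
dm = haversine_km(
    cust["Latitude"].to_numpy()[:, None], cust["Longitude"].to_numpy()[:, None],
    dep["Latitude"].to_numpy()[None, :], dep["Longitude"].to_numpy()[None, :],
)

# One row per customer, indexed like the customer rows of df
result = pd.DataFrame({
    "Distance_to_closest_depot": dm.min(axis=1),
    "Closest_depot": dep["Name"].to_numpy()[dm.argmin(axis=1)],
}, index=cust.index)

# Joining on the index leaves the depot rows with missing values
out = df.join(result)
csv_text = out.to_csv(index=False)  # missing values are written as empty fields
print(csv_text)
```

To save to disk instead of building a string, pass a file path to to_csv.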