I have the following DataFrame in pandas:
import pandas as pd
df = pd.DataFrame({
"CityId": {
"0": 0,
"1": 1,
"2": 2,
"3": 3,
"4": 4
},
"X": {
"0": 316.83673906150904,
"1": 4377.40597216624,
"2": 3454.15819771172,
"3": 4688.099297634771,
"4": 1010.6969517482901
},
"elevation_meters": {
"0": 1,
"1": 2,
"2": 3,
"3": 4,
"4": 5
},
"Y": {
"0": 2202.34070733524,
"1": 336.602082171235,
"2": 2820.0530112481106,
"3": 2935.89805580997,
"4": 3236.75098902635
}
})
I am trying to create a distance matrix that represents the cost of moving between each pair of CityIds. Using pdist and squareform from scipy.spatial.distance, I can do the following:
from scipy.spatial.distance import pdist, squareform
df_m = pd.DataFrame(
squareform(
pdist(
df[['CityId', 'X', 'Y']].iloc[:, 1:],
metric='euclidean')
),
index=df.CityId.unique(),
columns= df.CityId.unique()
)
This gives a distance matrix between all CityIds, built from the pairwise distances computed by pdist.
I would like to incorporate elevation_meters into this distance matrix. What is an efficient way to do this?
Answer 0 (score: 1)
You can try scipy.spatial.distance_matrix, passing elevation_meters as a third coordinate alongside X and Y:
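from scipy.spatial import distance_matrix
# Treat elevation as a third coordinate alongside X and Y
xx = df[['X','elevation_meters', 'Y']]
pd.DataFrame(distance_matrix(xx,xx), columns= df['CityId'],
index=df['CityId'])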
Output:
CityId 0 1 2 3 4
CityId
0 0.000000 4468.691544 3197.555070 4432.386687 1245.577226
1 4468.691544 0.000000 2649.512402 2617.799439 4443.602402
2 3197.555070 2649.512402 0.000000 1239.367465 2478.738402
3 4432.386687 2617.799439 1239.367465 0.000000 3689.688537
4 1245.577226 4443.602402 2478.738402 3689.688537 0.000000
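The same 3D distance can also be obtained with the pdist and squareform calls from the question, simply by adding elevation_meters to the selected columns. pdist evaluates each pair only once, which can matter for a large number of cities. The sketch below is one possible way to do this, not the only one. The `elevation_weight` factor is a hypothetical addition, not part of the question or the answer: it is only there to show how a metre of climb could be made more or less costly than a unit of X/Y. With a weight of 1.0 the result should match the output above.

```python
import numpy as np
import pandas as pd
from scipy.spatial.distance import pdist, squareform

# Same data as in the question, written with list values for brevity.
df = pd.DataFrame({
    "CityId": [0, 1, 2, 3, 4],
    "X": [316.83673906150904, 4377.40597216624, 3454.15819771172,
          4688.099297634771, 1010.6969517482901],
    "Y": [2202.34070733524, 336.602082171235, 2820.0530112481106,
          2935.89805580997, 3236.75098902635],
    "elevation_meters": [1, 2, 3, 4, 5],
})

# Hypothetical weight (an assumption for illustration): how many planar
# units one metre of elevation difference should count for.
elevation_weight = 1.0

# Copy the coordinates into a float array so the scaling below
# does not modify the original DataFrame.
coords = df[["X", "Y", "elevation_meters"]].to_numpy(dtype=float, copy=True)
coords[:, 2] *= elevation_weight  # scale the elevation axis only

# pdist returns the condensed pairwise distances; squareform expands
# them into the full symmetric matrix.
df_m = pd.DataFrame(
    squareform(pdist(coords, metric="euclidean")),
    index=df["CityId"],
    columns=df["CityId"],
)
print(df_m)
```

Because the elevation differences here are only a few metres while X and Y differ by thousands of units, the elevation term barely changes the distances unless the weight is large.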