Getting the index of the minimum distance in a pandas DataFrame

Time: 2018-11-19 17:29:06

Tags: python pandas

The head of my data looks like this:

   lat1 long1 state           county   lat2  long2  dist Depot
0     .     .    AK   Aleutians West   11.0   23.0  121   0.0
1     .     .    AK     Wade Hampton   33.0   11.0  12    0.0
2     .     .    AK      North Slope   55.0   11.0  43    0.0
3     .     .    AK  Kenai Peninsula   44.0   11.0  43    0.0
4     .     .    AK        Anchorage   11.0   11.0  99    0.0
5     1     2    AK        Anchorage    NaN    NaN  1e10  0.0
6     .     .    AK        Anchorage   55.0   44.0  33    0.0
7     3     4    AK        Anchorage    NaN    NaN  1e10  0.0
8     .     .    AK        Anchorage    3.0    2.0  23    0.0
9     .     .    AK        Anchorage    5.0   11.0  32    0.0
10    .     .    AK        Anchorage   42.0   22.0  33    0.0
11    .     .    AK        Anchorage   11.0    2.0  33    0.0
12    .     .    AK        Anchorage  444.0    1.0  43    0.0
13    .     .    AK        Anchorage    1.0    2.0  53    0.0
14    0     2    AK        Anchorage    NaN    NaN  1e10  0.0
15    .     .    AK        Anchorage    1.0    1.0  32    0.0
16    .     .    AK        Anchorage  111.0   11.0  22    0.0

The dist column holds, for each lat2/long2 entry, the minimum distance to any of the lat1/long1 entries. My code is as follows:

import pandas as pd
from haversine import haversine
import numpy as np

path = 'distance.xlsx'
df = pd.read_excel(path, usecols=[1, 2, 4, 5, 7, 8])  # parse_cols was renamed to usecols
# size the new columns from the data instead of hard-coding the row count
df = df.assign(dist=np.zeros(len(df)), Depot=np.zeros(len(df)))


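for i in range(len(df)):
    temp = 1e10  # sentinel larger than any real distance, reset for each row
    p2 = (df['lat2'].iloc[i], df['long2'].iloc[i])
    for j in range(len(df)):
        # '.' marks a missing coordinate; compare strings with ==, not `is`
        if df['lat1'].iloc[j] == '.' or df['long1'].iloc[j] == '.':
            continue
        p1 = (df['lat1'].iloc[j], df['long1'].iloc[j])
        # newer haversine releases take unit=Unit.MILES instead of miles=True
        temp = min(temp, haversine(p1, p2, miles=True))
    # a single .loc assignment avoids pandas chained-assignment problems
    df.loc[df.index[i], 'dist'] = temp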

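For the minimum distance at each index, I want to know the corresponding (lat1, long1) point, i.e. p1 in the code, and store it in the Depot column. I know this must be some indexing command, but I am struggling to find the exact form.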
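
One possible way to get that index is to build the full distance matrix and take `argmin` along the depot axis. The following is only a sketch under some assumptions: the names `haversine_miles`, `nearest_depot` and `EARTH_RADIUS_MILES` are hypothetical, the distance is computed with a small NumPy haversine rather than the `haversine` package, at least one row has a lat1/long1 point, and Depot stores the index label of the closest lat1/long1 row rather than the coordinates themselves.

```python
import numpy as np
import pandas as pd

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; an assumed constant for this sketch


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles; inputs in degrees, NumPy-broadcastable."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * np.arcsin(np.sqrt(a))


def nearest_depot(df):
    """Return a copy with `dist` and `Depot` filled for rows that have lat2/long2."""
    out = df.copy()
    # '.' placeholders become NaN so the coordinate columns are numeric.
    lat1 = pd.to_numeric(out['lat1'], errors='coerce')
    long1 = pd.to_numeric(out['long1'], errors='coerce')
    depots = out.index[lat1.notna() & long1.notna()]  # rows holding a lat1/long1 point
    targets = out.index[out['lat2'].notna() & out['long2'].notna()]

    # Distance matrix: one row per target, one column per depot.
    d = haversine_miles(
        lat1.loc[depots].to_numpy()[None, :],
        long1.loc[depots].to_numpy()[None, :],
        out.loc[targets, 'lat2'].to_numpy(dtype=float)[:, None],
        out.loc[targets, 'long2'].to_numpy(dtype=float)[:, None],
    )
    best = d.argmin(axis=1)  # column position of the closest depot for each target
    out.loc[targets, 'dist'] = d.min(axis=1)
    # Translate column positions back into index labels of the depot rows.
    out.loc[targets, 'Depot'] = depots.to_numpy()[best]
    return out
```

With the result in `res = nearest_depot(df)`, the coordinates of the closest point for a given row can be looked up with `res.loc[res.loc[row, 'Depot'], ['lat1', 'long1']]`. The matrix has one entry per (target, depot) pair, so this trades memory for speed compared with the nested loop.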

0 answers:

No answers