Computing the minimum distance between coordinate pairs

Time: 2016-08-19 17:19:27

Tags: python dataframe

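The first dataframe, df1, contains Ids and the two coordinates that go with each Id. For each coordinate pair in the first dataframe, I have to loop through the second dataframe to find the row at the smallest distance. I tried taking the coordinates individually and finding the distance between them, but that did not work as expected. I believe the coordinates have to be treated as a pair when computing the distance. I am not sure whether Python provides a method for this.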

For example, df1:

Id        Co1            Co2
334    30.371353      -95.384010
337    39.497448      -119.789623

df2:

Id       Co1             Co2
339    40.914585      -73.892456
441    34.760395      -77.999260

dfloc3 = [[38.991512,-77.441536],
          [40.89869,-72.37637],
          [40.936115,-72.31452],
          [30.371353,-95.38401],
          [39.84819,-75.37162],
          [36.929306,-76.20035],
          [40.682342,-73.979645]]


dfloc4 = [[40.914585,-73.892456],
          [41.741543,-71.406334],
          [50.154522,-96.88806],
          [39.743565,-121.795761],
          [30.027597,-89.91014],
          [36.51881,-82.560844],
          [30.449587,-84.23629],
          [42.920475,-85.8208]]

2 Answers:

Answer 0 (score: 1)

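The following code creates a new column in df1 showing the Id of the nearest point in df2. (I can't tell from the question whether this is what you want.) I am assuming the coordinates lie in a Euclidean space, i.e. the distance between points is given by the Pythagorean theorem. If not, you can easily substitute another calculation for dist_squared.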

import pandas as pd

df1 = pd.DataFrame(dict(Id=[334, 337], Co1=[30.371353, 39.497448], Co2=[-95.384010, -119.789623]))
df2 = pd.DataFrame(dict(Id=[339, 441], Co1=[40.914585, 34.760395], Co2=[-73.892456, -77.999260]))

def nearest(row, df):
    # squared Euclidean distance from the given row to every row of df
    # (the square root is skipped: it does not change which row is closest)
    dist_squared = (row.Co1 - df.Co1) ** 2 + (row.Co2 - df.Co2) ** 2
    # index label of the closest row; idxmin returns a label, which is what .loc expects
    smallest_idx = dist_squared.idxmin()
    # return the Id for the closest row of df
    return df.loc[smallest_idx, 'Id']

near = df1.apply(nearest, args=(df2,), axis=1)

df1['nearest'] = near
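
The sample values look like latitude/longitude pairs in degrees, although the question does not say so. If that assumption holds, a great-circle (haversine) distance could stand in for dist_squared. The sketch below is one possible way to do it; the names haversine_km and nearest_haversine and the mean Earth radius of 6371 km are illustrative choices, not something taken from the question.

```python
import numpy as np
import pandas as pd

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an approximation

def haversine_km(lat1, lon1, lat2, lon2):
    # great-circle distance in km; inputs are assumed to be degrees
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(a))

def nearest_haversine(row, df):
    # assumes Co1 is latitude and Co2 is longitude
    dist = haversine_km(row.Co1, row.Co2, df.Co1, df.Co2)
    # Id of the row of df with the smallest great-circle distance
    return df.loc[dist.idxmin(), 'Id']

df1 = pd.DataFrame(dict(Id=[334, 337], Co1=[30.371353, 39.497448], Co2=[-95.384010, -119.789623]))
df2 = pd.DataFrame(dict(Id=[339, 441], Co1=[40.914585, 34.760395], Co2=[-73.892456, -77.999260]))

df1['nearest'] = df1.apply(nearest_haversine, args=(df2,), axis=1)
print(df1)
```

For the two sample rows this picks the same Id as the Euclidean version, but the two measures can disagree for other data, so it is worth deciding which one the problem actually calls for.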

Answer 1 (score: 1)

Given that you can turn your points into lists like this...

df1 = [[30.371353, -95.384010], [39.497448, -119.789623]]
df2 = [[40.914585, -73.892456], [34.760395, -77.999260]]

Import math, then create a function to make finding the distance easier:

import math

def distance(pt1, pt2):
    return math.sqrt((pt1[0] - pt2[0])**2 + (pt1[1] - pt2[1])**2)

Then simply traverse your lists, saving the closest point:

for pt1 in df1:
    closestPoints = [pt1, df2[0]]
    for pt2 in df2:
        if distance(pt1, pt2) < distance(closestPoints[0], closestPoints[1]):
            closestPoints = [pt1, pt2]
    print ("Point: " + str(closestPoints[0]) + " is closest to " + str(closestPoints[1]))

Output:

Point: [30.371353, -95.38401] is closest to [34.760395, -77.99926]
Point: [39.497448, -119.789623] is closest to [34.760395, -77.99926]
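
For longer lists, the nested loop could be replaced by a vectorized calculation. The following is a minimal sketch using NumPy broadcasting, under the same Euclidean assumption; the names pts1, pts2, dist and nearest_idx are made up for illustration.

```python
import numpy as np

pts1 = np.array([[30.371353, -95.384010], [39.497448, -119.789623]])
pts2 = np.array([[40.914585, -73.892456], [34.760395, -77.999260]])

# diff[i, j] holds pts1[i] - pts2[j]; shape is (len(pts1), len(pts2), 2)
diff = pts1[:, None, :] - pts2[None, :, :]
# dist[i, j] is the Euclidean distance between pts1[i] and pts2[j]
dist = np.sqrt((diff ** 2).sum(axis=2))
# for each point in pts1, the position of its closest point in pts2
nearest_idx = dist.argmin(axis=1)

for p, q in zip(pts1, pts2[nearest_idx]):
    print("Point:", p.tolist(), "is closest to", q.tolist())
```

This builds the full distance matrix in memory, so it suits moderately sized inputs; for very large point sets a spatial index would probably be a better fit.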