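I am looking for a command-line solution to find the closest pair of points in a list of coordinates stored as CSV.
Here is an answer for Excel, but I need a different solution.
I am not looking for the nearest neighbor of each point, but for the pair of points with the smallest distance between them.
I want to match many power plants from GEO, so a (Python?) command-line tool would be great.
Here is a sample dataset: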
Chicoasén Dam,16.941064,-93.100828
Tuxpan Oil Power Plant,21.014891,-97.334492
Petacalco Coal Power Plant,17.983575,-102.115252
Angostura Dam,16.401226,-92.778926
Tula Oil Power Plant,20.055825,-99.276857
Carbon II Coal Power Plant,28.467176,-100.698559
Laguna Verde Nuclear Power Plant,19.719095,-96.406347
Carbón I Coal Power Plant,28.485238,-100.69096
Manzanillo I Oil Power Plant,19.027372,-104.319274
Tamazunchale Gas Power Plant,21.311282,-98.756266
The tool should print "Carbon II" and "Carbón I", because this pair has the smallest distance.
A code snippet could be:
from math import radians, cos, sin, asin, sqrt
import csv

def haversine(lon1, lat1, lon2, lat2):
    """Great-circle distance in km between two points given in decimal degrees."""
    # convert decimal degrees to radians
    lon1, lat1, lon2, lat2 = map(radians, [lon1, lat1, lon2, lat2])
    # haversine formula
    dlon = lon2 - lon1
    dlat = lat2 - lat1
    a = sin(dlat/2)**2 + cos(lat1) * cos(lat2) * sin(dlon/2)**2
    c = 2 * asin(sqrt(a))
    km = 6371 * c  # 6371 km is the mean Earth radius
    return km

# each CSV row is: name, latitude, longitude
with open('mexico-test.csv', newline='') as csvfile:
    so = csv.reader(csvfile, delimiter=',', quotechar='|')
    data = []
    for row in so:
        data.append(row)

# haversine() takes longitude first, then latitude
print(haversine(-100.698559, 28.467176, -100.69096, 28.485238))
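Answer 0 (score: 0)
A simple approach is to compute all pairs and then take the smallest one, where the "size" of a pair is defined as the distance between its two points: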
from itertools import combinations

# rows are [name, latitude, longitude]; haversine() expects longitude first
closest = min(combinations(data, 2),
              key=lambda p: haversine(float(p[0][2]), float(p[0][1]), float(p[1][2]), float(p[1][1])))
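To check the all-pairs idea without a CSV file, here is a minimal self-contained sketch. It assumes the question's column order (name, latitude, longitude) and hard-codes four rows from the sample data; `ROWS` and `pair_distance` are hypothetical names introduced only for this illustration. Note that n rows produce n(n-1)/2 pairs, so this brute-force approach is best suited to lists that are not huge.

```python
from itertools import combinations
from math import radians, cos, sin, asin, sqrt

# Four rows from the sample data, in the order: name, latitude, longitude
ROWS = [
    ("Chicoasén Dam", 16.941064, -93.100828),
    ("Angostura Dam", 16.401226, -92.778926),
    ("Carbon II Coal Power Plant", 28.467176, -100.698559),
    ("Carbón I Coal Power Plant", 28.485238, -100.69096),
]

def haversine(lon1, lat1, lon2, lat2):
    """Great-circle distance in km between two points given in decimal degrees."""
    lon1, lat1, lon2, lat2 = map(radians, [lon1, lat1, lon2, lat2])
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 6371 * 2 * asin(sqrt(a))

def pair_distance(pair):
    """Distance in km between the two rows of a pair (hypothetical helper)."""
    (_, lat1, lon1), (_, lat2, lon2) = pair
    # haversine() takes longitude first, so the columns are swapped here
    return haversine(lon1, lat1, lon2, lat2)

# Brute force: evaluate every unordered pair and keep the nearest one
closest = min(combinations(ROWS, 2), key=pair_distance)
print(closest[0][0], "<->", closest[1][0], f"({pair_distance(closest):.2f} km)")
```

Using a named helper instead of a long lambda keeps the latitude/longitude order in one place, which is the easiest part of this code to get wrong.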
To get the five smallest pairs, use a heap (heapq.nsmallest) with the same key:
import heapq

# nsmallest builds its own heap from the iterable, so no separate heapify step is needed
five_smallest = heapq.nsmallest(
    5,
    combinations(data, 2),
    key=lambda p: haversine(float(p[0][2]), float(p[0][1]), float(p[1][2]), float(p[1][1])))
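Since the question asks for a command-line tool, the pieces above could be wrapped roughly as follows. This is only a sketch under assumptions: the CSV has no header and uses the name, latitude, longitude column order, the file is UTF-8 encoded, and the function names (`load_points`, `closest_pairs`, `main`) and the `-n` option are made up for this example.

```python
import argparse
import csv
import heapq
from itertools import combinations
from math import radians, cos, sin, asin, sqrt

def haversine(lon1, lat1, lon2, lat2):
    """Great-circle distance in km between two points given in decimal degrees."""
    lon1, lat1, lon2, lat2 = map(radians, [lon1, lat1, lon2, lat2])
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 6371 * 2 * asin(sqrt(a))

def load_points(path):
    """Read (name, lat, lon) tuples from a header-less CSV file, skipping blank lines."""
    with open(path, newline="", encoding="utf-8") as csvfile:
        return [(row[0], float(row[1]), float(row[2])) for row in csv.reader(csvfile) if row]

def closest_pairs(points, count=1):
    """Return the `count` closest pairs as (distance_km, name_a, name_b), nearest first."""
    scored = (
        (haversine(lon1, lat1, lon2, lat2), name1, name2)
        for (name1, lat1, lon1), (name2, lat2, lon2) in combinations(points, 2)
    )
    # Tuples compare by their first element, so the distance acts as the sort key
    return heapq.nsmallest(count, scored)

def main(argv=None):
    """Parse command-line arguments and print one line per pair."""
    parser = argparse.ArgumentParser(description="Print the closest pairs of points in a CSV file.")
    parser.add_argument("csv_path", help="CSV file with name,latitude,longitude rows")
    parser.add_argument("-n", "--count", type=int, default=1, help="number of pairs to print")
    args = parser.parse_args(argv)
    for km, name_a, name_b in closest_pairs(load_points(args.csv_path), args.count):
        print(f"{km:.2f} km\t{name_a}\t{name_b}")
```

To run it as a script, call main() under the usual `if __name__ == "__main__":` guard at the bottom of the file and invoke it with something like `python closest_pairs.py mexico-test.csv -n 5` (the script name is hypothetical).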