KDTree (or something similar) for geospatial and temporal search

Time: 2018-07-19 08:55:12

Tags: python pandas geospatial kdtree

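I have a series of coordinates for road measurement loops, each with a GPS coordinate and a measurement time (a list of (loop, time, latitude, longitude) tuples). I need to match a driving car (a list of (time, latitude, longitude) tuples) to these loops, so that for each of the car's timestamps I find the closest GPS coordinate among the loops whose measurement time contains the car's timestamp.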

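I considered using a 3D KDTree over (time, latitude, longitude), but it struck me that it treats all dimensions equally, so it could return a loop that is far from the car in space but very close in time, which is not what I need.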
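One way around mixing seconds and degrees in a single distance is to treat time as a hard filter and space as the only metric. The following is a minimal sketch of that idea, not a tuned solution: it assumes the time condition means the car's timestamp falls inside a loop's half-open [start, end) interval, it uses brute-force haversine distance instead of a tree, and the names `haversine_m`, `nearest_active_loop` and the `demo_*` data are hypothetical.

```python
import math
from datetime import datetime

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def nearest_active_loop(car, loops):
    """Index of the spatially nearest loop that is active at the car's timestamp.

    car   : (time, lat, lon)
    loops : list of (start, end, lat, lon); a loop is "active" when
            start <= time < end (an assumption about the time condition).
    Returns None when no loop is active at that time.
    """
    t, lat, lon = car
    best_idx, best_dist = None, math.inf
    for idx, (start, end, loop_lat, loop_lon) in enumerate(loops):
        if not (start <= t < end):
            continue  # time is a filter, never part of the distance
        dist = haversine_m(lat, lon, loop_lat, loop_lon)
        if dist < best_dist:
            best_idx, best_dist = idx, dist
    return best_idx


# Hypothetical sample: loop 2 is only metres from the car, but its
# interval has already ended at 13:15:30, so it must not be returned.
demo_loops = [
    (datetime(2018, 4, 6, 13, 15), datetime(2018, 4, 6, 13, 16), 52.2862, 6.76684),
    (datetime(2018, 4, 6, 13, 15), datetime(2018, 4, 6, 13, 16), 52.2765, 6.83472),
    (datetime(2018, 4, 6, 13, 14), datetime(2018, 4, 6, 13, 15), 52.0253, 4.6578),
]
demo_car = (datetime(2018, 4, 6, 13, 15, 30), 52.0252768, 4.6577968)
demo_match = nearest_active_loop(demo_car, demo_loops)
```

With many loops per interval, the inner loop could be swapped for a 2D spatial index built once per time interval; the filter-then-search structure would stay the same.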

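Edit: the data looks like this (for the current purpose, matching on the loop's start is enough). What I am looking for is, for each row of the trip frame, the index of the matching row in the loops frame: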

from io import StringIO
import pandas as pd

loops = StringIO("""
start,end,flow,speed,lat,long
2018-04-06 13:15:00,2018-04-06 13:16:00,,80.0,52.2862,6.766839999999999
2018-04-06 13:15:00,2018-04-06 13:16:00,,80.0,52.2765,6.83472
2018-04-06 13:15:00,2018-04-06 13:16:00,,72.0,52.2765,6.83472
2018-04-06 13:15:00,2018-04-06 13:16:00,,-1.0,52.2765,6.83472
2018-04-06 13:15:00,2018-04-06 13:16:00,,78.66999799999999,52.2765,6.83472
2018-04-06 13:15:00,2018-04-06 13:16:00,,83.0,52.2939,6.885
2018-04-06 13:15:00,2018-04-06 13:16:00,,74.0,52.2939,6.885
2018-04-06 13:15:00,2018-04-06 13:16:00,,61.0,52.2939,6.885
2018-04-06 13:15:00,2018-04-06 13:16:00,,78.559998,52.2939,6.885
2018-04-06 13:15:00,2018-04-06 13:16:00,,75.0,52.2882,6.92357
2018-04-06 13:15:00,2018-04-06 13:16:00,,71.0,52.2882,6.92357
2018-04-06 13:15:00,2018-04-06 13:16:00,,-1.0,52.2882,6.92357
2018-04-06 13:15:00,2018-04-06 13:16:00,,74.199997,52.2882,6.92357
2018-04-06 13:15:00,2018-04-06 13:16:00,,-1.0,52.3086,7.004910000000001
2018-04-06 13:15:00,2018-04-06 13:16:00,,-1.0,52.3086,7.004910000000001
2018-04-06 13:15:00,2018-04-06 13:16:00,,-1.0,52.3086,7.004910000000001
2018-04-06 13:15:00,2018-04-06 13:16:00,,-1.0,52.3086,7.004910000000001
2018-04-06 13:15:00,2018-04-06 13:16:00,,-1.0,52.2938,6.88589
2018-04-06 13:15:00,2018-04-06 13:16:00,,-1.0,52.2938,6.88589
2018-04-06 13:15:00,2018-04-06 13:16:00,,-1.0,52.2938,6.88589
2018-04-06 13:15:00,2018-04-06 13:16:00,,-1.0,52.2938,6.88589
2018-04-06 13:15:00,2018-04-06 13:16:00,,-1.0,52.2878,6.92747
2018-04-06 13:15:00,2018-04-06 13:16:00,,-1.0,52.2878,6.92747
2018-04-06 13:15:00,2018-04-06 13:16:00,,-1.0,52.2878,6.92747
2018-04-06 13:15:00,2018-04-06 13:16:00,,-1.0,52.2878,6.92747
2018-04-06 13:15:00,2018-04-06 13:16:00,,-1.0,52.3092,7.00833
2018-04-06 13:15:00,2018-04-06 13:16:00,,-1.0,52.3092,7.00833
2018-04-06 13:15:00,2018-04-06 13:16:00,,-1.0,52.3092,7.00833
2018-04-06 13:15:00,2018-04-06 13:16:00,,-1.0,52.3092,7.00833
2018-04-06 13:15:00,2018-04-06 13:16:00,,-1.0,52.2958,6.88825
2018-04-06 13:15:00,2018-04-06 13:16:00,,-1.0,52.2958,6.88825
2018-04-06 13:15:00,2018-04-06 13:16:00,,-1.0,52.2958,6.88825
2018-04-06 13:15:00,2018-04-06 13:16:00,,-1.0,52.2958,6.88825
2018-04-06 13:15:00,2018-04-06 13:16:00,,-1.0,52.3108,7.014589999999999
2018-04-06 13:15:00,2018-04-06 13:16:00,,-1.0,52.3108,7.014589999999999
2018-04-06 13:15:00,2018-04-06 13:16:00,,-1.0,52.3108,7.014589999999999
2018-04-06 13:15:00,2018-04-06 13:16:00,,-1.0,52.3108,7.014589999999999
2018-04-06 13:15:00,2018-04-06 13:16:00,,82.0,52.2952,6.88543
2018-04-06 13:15:00,2018-04-06 13:16:00,,-1.0,52.2952,6.88543
2018-04-06 13:15:00,2018-04-06 13:16:00,,84.0,52.2952,6.88543
2018-04-06 13:15:00,2018-04-06 13:16:00,,82.16999799999999,52.2952,6.88543
2018-04-06 13:15:00,2018-04-06 13:16:00,,92.0,52.2886,6.92391
2018-04-06 13:15:00,2018-04-06 13:16:00,,-1.0,52.2886,6.92391
2018-04-06 13:15:00,2018-04-06 13:16:00,,-1.0,52.2886,6.92391
""".strip())
loops = pd.read_csv(loops)
loops['start'] = pd.to_datetime(loops['start'])
loops['end'] = pd.to_datetime(loops['end'])

trip = StringIO("""
timestamp,GPS_latitude_deg,GPS_longitude_deg
2018-04-06 13:14:54,52.0252768,4.6577968
2018-04-06 13:14:54.200000,52.0252768,4.6577972
2018-04-06 13:14:54.400000,52.0252768,4.6577968
2018-04-06 13:14:54.600000,52.0252768,4.6577964
2018-04-06 13:14:54.800000,52.0252768,4.657796
2018-04-06 13:14:55,52.0252768,4.657796
2018-04-06 13:14:55.200000,52.0252768,4.657796
2018-04-06 13:14:55.400000,52.0252768,4.657796
2018-04-06 13:14:55.600000,52.0252768,4.657796
2018-04-06 13:14:55.800000,52.0252768,4.657796
2018-04-06 13:14:56,52.0252768,4.657796
2018-04-06 13:14:56.200000,52.0252768,4.657796
2018-04-06 13:14:56.400000,52.0252768,4.6577964
2018-04-06 13:14:56.600000,52.0252768,4.657796
2018-04-06 13:14:56.800000,52.0252768,4.6577956
2018-04-06 13:14:57,52.0252768,4.657796
2018-04-06 13:14:57.200000,52.0252768,4.657796
2018-04-06 13:14:57.400000,52.0252768,4.657796
2018-04-06 13:14:57.600000,52.0252768,4.657796
2018-04-06 13:14:57.800000,52.0252768,4.657796
2018-04-06 13:14:58,52.0252768,4.657796
2018-04-06 13:14:58.200000,52.0252768,4.657796
2018-04-06 13:14:58.400000,52.0252768,4.657796
2018-04-06 13:14:58.600000,52.0252768,4.657796
2018-04-06 13:14:58.800000,52.0252768,4.657796
2018-04-06 13:14:59,52.0252768,4.6577964
2018-04-06 13:14:59.200000,52.0252768,4.6577964
2018-04-06 13:14:59.400000,52.0252768,4.6577964
2018-04-06 13:14:59.600000,52.0252768,4.6577968
2018-04-06 13:14:59.800000,52.0252768,4.6577968
2018-04-06 13:15:00,52.0252768,4.6577972
2018-04-06 13:15:00.200000,52.0252768,4.6577976
2018-04-06 13:15:00.400000,52.0252736,4.6577976
2018-04-06 13:15:00.600000,52.0252736,4.657798
2018-04-06 13:15:00.800000,52.0252736,4.6577984
2018-04-06 13:15:01,52.0252736,4.6577984
""".strip())
trip = pd.read_csv(trip)
trip['timestamp'] = pd.to_datetime(trip['timestamp'])

0 answers:

No answers