Convert Points to Lines in GeoPandas

Time: 2018-06-27 21:03:01

Tags: python pandas geopandas

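Hello, I am trying to convert a list of X and Y coordinates into lines. I want to map this data by grouping on ID and also on time (Hour). My code runs successfully as long as I group by one column, but I get an error when I group by two columns. I referred to this question.

Here is some sample data: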

ID  X           Y           Hour
1   -87.78976   41.97658    16
1   -87.66991   41.92355    16
1   -87.59887   41.708447   17
2   -87.73956   41.876827   16
2   -87.68161   41.79886    16
2   -87.5999    41.7083     16
3   -87.59918   41.708485   17
3   -87.59857   41.708393   17
3   -87.64391   41.675133   17

Here is my code:

import pandas as pd
from geopandas import GeoDataFrame
from shapely.geometry import Point, LineString

df = pd.read_csv("snow_gps.csv", sep=';')

# zip the coordinates into Point objects and convert to a GeoDataFrame
geometry = [Point(xy) for xy in zip(df.X, df.Y)]
geo_df = GeoDataFrame(df, geometry=geometry)

# aggregate these points with groupby
geo_df = geo_df.groupby(['track_seg_point_id', 'Hour'])['geometry'].apply(lambda x: LineString(x.tolist()))
geo_df = GeoDataFrame(geo_df, geometry='geometry')

This is the error: ValueError: LineStrings must have at least 2 coordinate tuples

This is the final result I want to get:

ID          Hour     geometry
1           16       LINESTRING (-87.78976 41.97658, -87.66991 41.9... 
1           17       LINESTRING (-87.78964000000001 41.976634999999... 
1           18       LINESTRING (-87.78958 41.97663499999999, -87.6... 
2           16       LINESTRING (-87.78958 41.976612, -87.669785 41... 
2           17       LINESTRING (-87.78958 41.976624, -87.66978 41.... 
3           16       LINESTRING (-87.78958 41.97666, -87.6695199999... 
3           17       LINESTRING (-87.78954 41.976665, -87.66927 41.... 

Any suggestions or ideas on how to group by multiple columns would be appreciated.

1 Answer:

Answer 0: (score: 1)

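Your code is fine; the problem is in your data.

As you can see, if you group by ID and Hour, only 1 point falls into the group with ID 1 and Hour 17. A LineString must contain 2 or more Points (it must have at least 2 coordinate tuples). I added another point to your sample data: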

    ID   X          Y           Hour
    1   -87.78976   41.97658    16
    1   -87.66991   41.92355    16
    1   -87.59887   41.708447   17
    1   -87.48234   41.677342   17
    2   -87.73956   41.876827   16
    2   -87.68161   41.79886    16
    2   -87.5999    41.7083     16
    3   -87.59918   41.708485   17
    3   -87.59857   41.708393   17
    3   -87.64391   41.675133   17
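
To check which (ID, Hour) groups are too small before building any lines, you can count the rows per group. This is a minimal sketch using only pandas, with the question's original sample data typed inline instead of read from the CSV; the variable names `group_sizes` and `too_small` are just illustrative.

```python
import pandas as pd

# The question's original sample data, inlined for illustration
# (the real data would come from the CSV file).
df = pd.DataFrame({
    "ID":   [1, 1, 1, 2, 2, 2, 3, 3, 3],
    "X":    [-87.78976, -87.66991, -87.59887, -87.73956, -87.68161,
             -87.5999, -87.59918, -87.59857, -87.64391],
    "Y":    [41.97658, 41.92355, 41.708447, 41.876827, 41.79886,
             41.7083, 41.708485, 41.708393, 41.675133],
    "Hour": [16, 16, 17, 16, 16, 16, 17, 17, 17],
})

# Number of points in each (ID, Hour) group.
group_sizes = df.groupby(["ID", "Hour"]).size()

# Groups with fewer than 2 points cannot form a LineString.
too_small = group_sizes[group_sizes < 2]
print(too_small)
```

For the original sample, the only group flagged is ID 1 at Hour 17, which is why the extra point was added there.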

And as you can see below, the code is almost the same as yours:

    import pandas as pd
    import geopandas as gpd
    from shapely.geometry import Point, LineString

    # a regex separator requires the python parsing engine
    df = pd.read_csv("snow_gps.csv", sep=r'\s*,\s*', engine='python')

    # zip the coordinates into Point objects and convert to a GeoDataFrame
    geometry = [Point(xy) for xy in zip(df.X, df.Y)]
    geo_df = gpd.GeoDataFrame(df, geometry=geometry)

    # build one LineString per (ID, Hour) group
    geo_df2 = geo_df.groupby(['ID', 'Hour'])['geometry'].apply(lambda x: LineString(x.tolist()))
    geo_df2 = gpd.GeoDataFrame(geo_df2, geometry='geometry')
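
If you cannot add points to the data, another option is to skip the groups that have fewer than 2 points. The sketch below is only one possible approach, not part of the code above: it uses the question's original sample data inlined, builds the lines in a plain dictionary (the name `lines` is illustrative), and silently drops single-point groups, which may or may not be what you want.

```python
import pandas as pd
from shapely.geometry import LineString

# The question's original sample data, inlined for illustration.
df = pd.DataFrame({
    "ID":   [1, 1, 1, 2, 2, 2, 3, 3, 3],
    "X":    [-87.78976, -87.66991, -87.59887, -87.73956, -87.68161,
             -87.5999, -87.59918, -87.59857, -87.64391],
    "Y":    [41.97658, 41.92355, 41.708447, 41.876827, 41.79886,
             41.7083, 41.708485, 41.708393, 41.675133],
    "Hour": [16, 16, 17, 16, 16, 16, 17, 17, 17],
})

# Build one LineString per (ID, Hour) group, skipping groups that have
# fewer than 2 points (these would raise the ValueError).
lines = {
    key: LineString(list(zip(group["X"], group["Y"])))
    for key, group in df.groupby(["ID", "Hour"])
    if len(group) >= 2
}
```

The keys of `lines` are `(ID, Hour)` tuples, so the result can be turned back into a table or a GeoDataFrame if needed.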