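Converting lists of coordinates to polygons with GeoPandas

Asked: 2021-02-18 18:16:06

Tags: python pandas geopandas

I have lists of coordinates in a CSV file (see the image). How can I convert them to polygons in a GeoDataFrame?

Below are the coordinates of a single polygon; I have thousands of rows like it.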

[118.103198,24.527338],[118.103224,24.527373],[118.103236,24.527366],[118.103209,24.527331],[118.103198,24.527338]

I tried the following code:

def bike_fence_format(s):
    s = s.replace('[', '').replace(']', '').split(',')
    return s

df['FENCE_LOC'] = df['FENCE_LOC'].apply(bike_fence_format)

df['LAT'] = df['FENCE_LOC'].apply(lambda x: x[1::2])
df['LON'] = df['FENCE_LOC'].apply(lambda x: x[::2])

df['geom'] = Polygon(zip(df['LON'].astype(str),df['LAT'].astype(str)))

But the last step fails, because df['LON'] returns a 'series' rather than a 'string' type. How can I get around this? A simpler way to reach my goal would be even better.

2 answers:

Answer 0 (score: 1)

First, recreate the sample df that the .csv file would give you (the exact result depends on how you read it with .read_csv()).

import pandas as pd
import geopandas as gpd
from shapely.geometry import Polygon  # needed for Polygon(...) below

df = pd.DataFrame({'FENCE_LOC': ['[32250,175889],[33913,180757],[29909,182124],[28246,177257],[32250,175889]', 
                  '[32250,175889],[33913,180757],[29909,182124],[28246,177257],[32250,175889]', 
                  '[32250,175889],[33913,180757],[29909,182124],[28246,177257],[32250,175889]']}, index=[0, 1, 2])

Modify your function slightly, because we want numeric values rather than strings:

def bike_fence_format(s):
    s = s.replace('[', '').replace(']', '').split(',')
    s = [float(x) for x in s]

    return s

df['FENCE_LOC'] = df['FENCE_LOC'].apply(bike_fence_format)


df['LAT'] = df['FENCE_LOC'].apply(lambda x: x[1::2])
df['LON'] = df['FENCE_LOC'].apply(lambda x: x[::2])
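To make the two slices concrete, here is a minimal sketch on one flattened row. The values are taken from the first three vertices of the sample data above; the variable names are only illustrative.

```python
# One row after bike_fence_format: a flat list alternating x, y, x, y, ...
flat = [32250.0, 175889.0, 33913.0, 180757.0, 29909.0, 182124.0]

lon = flat[::2]    # every second value, starting at index 0
lat = flat[1::2]   # every second value, starting at index 1

# Pairing the two lists again gives one (x, y) tuple per vertex
vertices = list(zip(lon, lat))
print(vertices)
```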

We can use a couple of list comprehensions to build a list of Shapely polygons.

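# Pair each row's list of x values with its list of y values
geom_list = [(x, y) for x, y in zip(df['LON'], df['LAT'])]

# Zip each pair into (x, y) vertices and build one Polygon per row
geom_list_2 = [Polygon(tuple(zip(x, y))) for x, y in geom_list]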

Finally, we can create a gdf from the list of Shapely polygons.

polygon_gdf = gpd.GeoDataFrame(geometry=geom_list_2)
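Since the question also asked for a simpler route, here is one possible sketch that skips the separate LAT/LON columns. It assumes every cell is a comma-separated run of [x,y] pairs, so that wrapping it in one more pair of brackets yields valid JSON; fence_to_polygon is a hypothetical helper name, not part of any library.

```python
import json

import geopandas as gpd
import pandas as pd
from shapely.geometry import Polygon


def fence_to_polygon(s):
    """Parse '[x,y],[x,y],...' into a Shapely Polygon (hypothetical helper)."""
    # Adding outer brackets turns the cell into a JSON list of [x, y] pairs
    coords = json.loads('[' + s + ']')
    return Polygon(coords)


# One row in the same layout as the question's FENCE_LOC column
df = pd.DataFrame({'FENCE_LOC': [
    '[118.103198,24.527338],[118.103224,24.527373],[118.103236,24.527366],'
    '[118.103209,24.527331],[118.103198,24.527338]',
]})

# Build one Polygon per row and attach the list as the geometry column
gdf = gpd.GeoDataFrame(
    df, geometry=[fence_to_polygon(s) for s in df['FENCE_LOC']]
)
```

Because each row is parsed on its own, a malformed cell raises an error inside fence_to_polygon, which makes bad rows easier to locate.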


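Answer 1 (score: 1)

To provide a small, representative dataset similar to what the OP posted as an image, I created the following rows of data (apologies for the many decimal digits):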

[[-2247824.100899419,-4996167.43201861],[-2247824.100899419,-4996067.43201861],[-2247724.100899419,-4996067.43201861],[-2247724.100899419,-4996167.43201861],[-2247824.100899419,-4996167.43201861]]
[[-2247724.100899419,-4996167.43201861],[-2247724.100899419,-4996067.43201861],[-2247624.100899419,-4996067.43201861],[-2247624.100899419,-4996167.43201861],[-2247724.100899419,-4996167.43201861]]
[[-2247624.100899419,-4996167.43201861],[-2247624.100899419,-4996067.43201861],[-2247524.100899419,-4996067.43201861],[-2247524.100899419,-4996167.43201861],[-2247624.100899419,-4996167.43201861]]
[[-2247824.100899419,-4996067.43201861],[-2247824.100899419,-4995967.43201861],[-2247724.100899419,-4995967.43201861],[-2247724.100899419,-4996067.43201861],[-2247824.100899419,-4996067.43201861]]
[[-2247724.100899419,-4996067.43201861],[-2247724.100899419,-4995967.43201861],[-2247624.100899419,-4995967.43201861],[-2247624.100899419,-4996067.43201861],[-2247724.100899419,-4996067.43201861]]
[[-2247624.100899419,-4996067.43201861],[-2247624.100899419,-4995967.43201861],[-2247524.100899419,-4995967.43201861],[-2247524.100899419,-4996067.43201861],[-2247624.100899419,-4996067.43201861]]
[[-2247824.100899419,-4995967.43201861],[-2247824.100899419,-4995867.43201861],[-2247724.100899419,-4995867.43201861],[-2247724.100899419,-4995967.43201861],[-2247824.100899419,-4995967.43201861]]
[[-2247724.100899419,-4995967.43201861],[-2247724.100899419,-4995867.43201861],[-2247624.100899419,-4995867.43201861],[-2247624.100899419,-4995967.43201861],[-2247724.100899419,-4995967.43201861]]
[[-2247624.100899419,-4995967.43201861],[-2247624.100899419,-4995867.43201861],[-2247524.100899419,-4995867.43201861],[-2247524.100899419,-4995967.43201861],[-2247624.100899419,-4995967.43201861]]

This data is saved as the file polygon_data.csv.

For the code, the modules are first loaded with

import geopandas as gpd
import pandas as pd
from shapely.geometry import Polygon

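The data is then read with pandas.read_csv() to create a dataframe. To get each row of data into a single column of the dataframe, delimiter="x" is used. Because no row of the data contains an x, nothing is split, and each whole row ends up as one long string.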

df3 = pd.read_csv('polygon_data.csv', header=None, index_col=None, delimiter="x")
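If relying on a delimiter that never occurs in the data feels fragile, one alternative sketch is to read the lines yourself and hand them to pandas. Here io.StringIO stands in for the opened file, and the two short rows and the name df_lines are made up for illustration.

```python
import io

import pandas as pd

# Stand-in for open('polygon_data.csv'); two short rows in the same layout
fake_file = io.StringIO(
    "[[0,0],[0,1],[1,1],[1,0],[0,0]]\n"
    "[[1,0],[1,1],[2,1],[2,0],[1,0]]\n"
)

# Keep each non-empty line as one string, with no delimiter parsing at all
rows = [line.strip() for line in fake_file if line.strip()]

# Single column named 0, mirroring what header=None produces
df_lines = pd.DataFrame({0: rows})
```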

To see the contents of df3, you can run

df3.head()

which gives a single-column dataframe (column header: 0):

                                                   0
0  [[-2247824.100899419,-4996167.43201861],[-2247...
1  [[-2247724.100899419,-4996167.43201861],[-2247...
2  [[-2247624.100899419,-4996167.43201861],[-2247...
3  [[-2247824.100899419,-4996067.43201861],[-2247...
4  [[-2247724.100899419,-4996067.43201861],[-2247...

Next, df3 is used to create a GeoDataFrame. The data in each row of df3 is used to create a Polygon object, which becomes the geometry of the GeoDataFrame polygon_df3.

geometry = [Polygon(eval(xy_string)) for xy_string in df3[0]]
polygon_df3 = gpd.GeoDataFrame(df3,
                   #crs="EPSG:4326", #uncomment this if (x,y) is long/lat
                   geometry=geometry)
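A note on eval: it executes whatever Python expression the string contains, which is risky if the CSV comes from an untrusted source. A possible drop-in sketch uses ast.literal_eval from the standard library, which only accepts Python literals such as nested lists of numbers. The row below is the first line of the sample data; the names coords and poly are only illustrative.

```python
import ast

from shapely.geometry import Polygon

# First row of polygon_data.csv, as read into df3[0]
xy_string = (
    "[[-2247824.100899419,-4996167.43201861],"
    "[-2247824.100899419,-4996067.43201861],"
    "[-2247724.100899419,-4996067.43201861],"
    "[-2247724.100899419,-4996167.43201861],"
    "[-2247824.100899419,-4996167.43201861]]"
)

# literal_eval parses the nested list without running arbitrary code
coords = ast.literal_eval(xy_string)
poly = Polygon(coords)
```

For the whole column, the same idea would read geometry = [Polygon(ast.literal_eval(s)) for s in df3[0]].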

Finally, the GeoDataFrame can be plotted with a single command:

# plot the GeoDataFrame
polygon_df3.plot(edgecolor='black')

For the particular data I suggested, the output plot is a 3x3 grid of squares (image: 3x3sq).